PREFACE


My  aim  in  writing  this  book  has  been  to  provide  an  introductory  text- 
book of  quantitative  genetics,  with  the  emphasis  on  general  principles 
rather  than  on  practical  application,  and  one  moreover  that  can  be 
understood  by  biologists  of  no  more  than  ordinary  mathematical 
ability.  In  pursuit  of  this  latter  aim  I  have  set  out  the  mathematics  in 
the  form  that  I,  being  little  of  a  mathematician,  find  most  compre- 
hensible, hoping  that  the  consequent  lack  of  rigour  and  elegance  will 
be  compensated  for  by  a  wider  accessibility.  The  reader  is  not,  how- 
ever, asked  to  accept  conclusions  without  proof.  Though  only  the 
simplest  algebra  is  used,  all  the  mathematical  deductions  essential 
to  the  exposition  of  the  subject  are  demonstrated  in  full.  Some 
knowledge  of  statistics,  however,  is  assumed,  particularly  of  the  ana- 
lysis of  variance  and  of  correlation  and  regression.  Elementary 
knowledge  of  Mendelian  genetics  is  also  assumed. 

I  have  had  no  particular  class  of  reader  exclusively  in  mind,  but 
have  tried  to  make  the  book  useful  to  as  wide  a  range  of  readers  as 
possible.  In  consequence  some  will  find  less  detail  than  they  require 
and  others  more.  Those  who  intend  to  become  specialists  in  this 
branch  of  genetics  or  in  its  application  to  animal  or  plant  breeding 
will  find  all  they  require  of  the  general  principles,  but  will  find  little 
guidance  in  the  techniques  of  experimentation  or  of  breeding 
practice.  Those  for  whom  the  subject  forms  part  of  a  course  of 
general  genetics  will  find  a  good  deal  more  detail  than  they  require. 
The  section  headings,  however,  should  facilitate  the  selection  of  what 
is  relevant,  and  any  of  the  following  chapters  could  be  omitted  without 
serious loss of continuity: Chapters 4, 5, 10 (after p. 168), 12, 13,
and  15-20. 

The  choice  of  symbols  presented  some  difficulties  because  there 
are  several  different  systems  in  current  use,  and  it  proved  impossible 
to  build  up  a  self-consistent  system  entirely  from  these.  I  have 
accordingly  adopted  what  seemed  to  me  the  most  appropriate  of  the 
symbols in current use, but have not hesitated to introduce new
symbols  where  consistency  or  clarity  seemed  to  require  them.  I 
hope  that  my  system  will  not  be  found  unduly  confusing  to  those 
accustomed  to  a  different  one.  There  is  a  list  of  symbols  at  the  end, 
where  some  of  the  equivalents  in  other  systems  are  given. 

INTRODUCTION

Quantitative  genetics  is  concerned  with  the  inheritance  of  those  differ- 
ences between  individuals  that  are  of  degree  rather  than  of  kind, 
quantitative  rather  than  qualitative.  These  are  the  individual  differ- 
ences which,  as  Darwin  wrote,  "afford  materials  for  natural  selection 
to  act  on  and  accumulate,  in  the  same  manner  as  man  accumulates  in 
any  given  direction  individual  differences  in  his  domestic  produc- 
tions." An  understanding  of  the  inheritance  of  these  differences  is  thus 
of  fundamental  significance  in  the  study  of  evolution  and  in  the  appli- 
cation of  genetics  to  animal  and  plant  breeding;  and  it  is  from  these 
two  fields  of  enquiry  that  the  subject  has  received  the  chief  impetus  to 
its  growth. 

Virtually  every  organ  and  function  of  any  species  shows  individual 
differences  of  this  nature,  the  differences  of  size  among  ourselves  or 
our  domestic  animals  being  an  example  familiar  to  all.  Individuals 
form  a  continuously  graded  series  from  one  extreme  to  the  other  and 
do  not  fall  naturally  into  sharply  demarcated  types.  Qualitative 
differences,  in  contrast,  divide  individuals  into  distinct  types  with 
little  or  no  connexion  by  intermediates.  Examples  are  the  differ- 
ences between  blue-eyed  and  brown-eyed  individuals,  between  the 
blood  groups,  or  between  normally  coloured  and  albino  individuals. 
The  distinction  between  quantitative  and  qualitative  differences 
marks,  in  respect  of  the  phenomena  studied,  the  distinction  between 
quantitative  genetics  and  the  parent  stem  of  "Mendelian"  genetics. 
In  respect  of  the  mechanism  of  inheritance  the  distinction  is  between 
differences  caused  by  many  or  by  few  genes.  The  familiar  Mendelian 
ratios,  which  display  the  fundamental  mechanism  of  inheritance,  can 
be  seen  only  when  a  gene  difference  at  a  single  locus  gives  rise  to  a 
readily  detectable  difference  in  some  property  of  the  organism. 
Quantitative  differences,  in  so  far  as  they  are  inherited,  depend  on  gene 
differences  at  many  loci,  the  effects  of  which  are  not  individually  dis- 
tinguishable. Consequently  the  Mendelian  ratios  are  not  exhibited 
by  quantitative  differences,  and  the  methods  of  Mendelian  analysis 
are  inappropriate. 
It is, nevertheless, a basic premiss of quantitative genetics that the
inheritance  of  quantitative  differences  depends  on  genes  subject  to 
the  same  laws  of  transmission  and  having  the  same  general  properties 
as the genes whose transmission and properties are displayed by
qualitative  differences.  Quantitative  genetics  is  therefore  an  extension 
of  Mendelian  genetics,  resting  squarely  on  Mendelian  principles  as  its 
foundation. 

The  methods  of  study  in  quantitative  genetics  differ  from  those 
employed  in  Mendelian  genetics  in  two  respects.  In  the  first  place, 
since  ratios  cannot  be  observed,  single  progenies  are  uninformative, 
and  the  unit  of  study  must  be  extended  to  "populations,"  that  is 
larger  groups  of  individuals  comprising  many  progenies.  And,  in  the 
second  place,  the  nature  of  the  quantitative  differences  to  be  studied 
requires  the  measurement,  and  not  just  the  classification,  of  the  indi- 
viduals. The  extension  of  Mendelian  genetics  into  quantitative  gene- 
tics may  thus  be  made  in  two  stages,  the  first  introducing  new  con- 
cepts connected  with  the  genetic  properties  of  "populations"  and  the 
second  introducing  concepts  connected  with  the  inheritance  of 
measurements.  This  is  how  the  subject  is  presented  in  this  book.  In 
the first part, which occupies Chapters 1 to 5, the genetic properties of
populations  are  described  by  reference  to  genes  causing  easily  identi- 
fiable, and  therefore  qualitative,  differences.  Quantitative  differences 
are  not  discussed  until  the  second  part,  which  starts  in  Chapter  6. 
These  two  parts  of  the  subject  are  often  distinguished  by  different 
names,  the  first  being  referred  to  as  "Population  Genetics"  and  the 
second  as  "Biometrical  Genetics"  or  "Quantitative  Genetics." 
Some  writers,  however,  use  "Population  Genetics"  to  refer  to  the 
whole.  The  terminology  of  this  distinction  is  therefore  ambiguous. 
The  use  of  "Quantitative  Genetics"  to  refer  to  the  whole  subject  may 
be  justified  on  the  grounds  that  the  genetics  of  populations  is  not  just 
a  preliminary  to  the  genetics  of  quantitative  differences,  but  an  in- 
tegral part  of  it. 

The  theoretical  basis  of  quantitative  genetics  was  established 
round about 1920 by the work of Fisher (1918), Haldane (1924-32,
summarised  1932)  and  Wright  (1921).  The  development  of  the 
subject  over  the  succeeding  years,  by  these  and  many  other  gene- 
ticists and  statisticians,  has  been  mainly  by  elaboration,  clarifica- 
tion, and  the  filling  in  of  details,  so  that  today  we  have  a  substantial 
body  of  theory  accepted  by  the  majority  as  valid.  As  in  any  healthily 
growing  science,  there  are  differences  of  opinion,  but  these  are  chiefly 
matters of emphasis, about the relative importance of this or that
aspect. 

The  theory  consists  of  the  deduction  of  the  consequences  of 
Mendelian  inheritance  when  extended  to  the  properties  of  popula- 
tions and  to  the  simultaneous  segregation  of  genes  at  many  loci.  The 
premiss  from  which  the  deductions  are  made  is  that  the  inheritance  of 
quantitative  differences  is  by  means  of  genes,  and  that  these  genes 
are  subject  to  the  Mendelian  laws  of  transmission  and  may  have  any 
of  the  properties  known  from  Mendelian  genetics.  The  property  of 
"variable  expression"  assumes  great  importance  and  might  be  raised 
to  the  status  of  another  premiss:  that  the  expression  of  the  genotype 
in  the  phenotype  is  modifiable  by  non-genetic  causes.  Other  pro- 
perties whose  consequences  are  to  be  taken  into  account  include 
dominance,  epistasis,  pleiotropy,  linkage,  and  mutation. 

These  theoretical  deductions  enable  us  to  state  what  will  be  the 
genetic  properties  of  a  population  if  the  genes  have  the  properties 
postulated,  and  to  predict  what  will  be  the  consequences  of  applying 
any  specified  plan  of  breeding.  In  principle  we  should  then  be  able  to 
make  observations  of  the  genetic  properties  of  natural  or  experi- 
mental populations,  and  of  the  outcome  of  special  breeding  methods, 
and  deduce  from  these  observations  what  are  the  properties  of  the 
genes  concerned.  The  experimental  side  of  quantitative  genetics, 
however,  has  lagged  behind  the  theoretical  in  its  development,  and  it 
is  still  some  way  from  fulfilling  this  complementary  function.  The 
reason  for  this  is  the  difficulty  of  devising  diagnostic  experiments 
which  will  unambiguously  discriminate  between  the  many  possible 
situations  envisaged  by  the  theory.  Consequently  the  experimental 
side  has  developed  in  a  somewhat  empirical  manner,  building  general 
conclusions  out  of  the  experience  of  many  particular  cases.  Never- 
theless there  is  now  a  sufficient  body  of  experimental  data  to  substan- 
tiate the  theory  in  its  main  outlines;  to  allow  a  number  of  generalisa- 
tions to  be  made  about  the  inheritance  of  quantitative  differences; 
and  to  enable  us  to  predict  with  some  confidence  the  outcome  of 
certain  breeding  methods.  Discussion  of  all  the  difficulties  would  be 
inappropriate  in  an  introductory  treatment.  The  aim  here  is  to 
describe  all  that  is  reasonably  firmly  established  and,  for  the  sake  of 
clarity,  to  simplify  as  far  as  is  possible  without  being  misleading. 
Consequently  the  emphasis  is  on  the  theoretical  side.  Though  con- 
clusions will  often  be  drawn  directly  from  experimental  data,  the 
experimental  side  of  the  subject  is  presented  chiefly  in  the  form  of 
examples, chosen with the purpose of illustrating the theoretical
conclusions.  These  examples,  however,  cannot  always  be  taken  as 
substantiating  the  postulates  that  underlie  the  conclusions  they 
illustrate.  Too  often  the  results  of  experiments  are  open  to  more  than 
one  interpretation. 

No  attempt  has  been  made  to  give  exhaustive  references  to  pub- 
lished work  in  any  part  of  the  subject;  or  to  indicate  the  origins,  or 
trace  the  history,  of  the  ideas.  To  have  done  this  would  have  required 
a  much  longer  book,  and  a  considerable  sacrifice  of  clarity.  The  chief 
sources,  from  which  most  of  the  material  of  the  book  is  derived,  are 
listed  below.  These  sources  are  not  regularly  cited  in  the  text. 
References  are  given  in  the  text  when  any  conclusion  is  stated  without 
full  explanation  of  its  derivation.  These  references  are  not  always  to 
the  original  papers,  but  rather  to  the  more  recent  papers  where  the 
reader  will  find  a  convenient  point  of  entry  to  the  topic  under  dis- 
cussion. References  are  also  given  to  the  sources  of  experimental 
data,  but  these,  for  reasons  already  explained,  cover  only  a  small  part 
of  the  experimental  side  of  the  subject.  In  particular,  a  great  deal 
more  work  has  been  done  on  plants  and  on  farm  animals  than  would 
appear  from  its  representation  among  the  experimental  work  cited. 

CHAPTER 1

GENETIC CONSTITUTION OF A POPULATION

Frequencies  of  Genes  and  Genotypes 

To  describe  the  genetic  constitution  of  a  group  of  individuals  we 
should  have  to  specify  their  genotypes  and  say  how  many  of  each  geno- 
type there  were.  This  would  be  a  complete  description,  provided  the 
nature  of  the  phenotypic  differences  between  the  genotypes  did  not 
concern  us.  Suppose  for  simplicity  that  we  were  concerned  with  a 
certain  autosomal  locus,  A,  and  that  two  different  alleles  at  this  locus, 
A1 and A2, were present among the individuals. Then there would be
three possible genotypes, A1A1, A1A2, and A2A2. (We are concerned
here,  as  throughout  the  book,  exclusively  with  diploid  organisms.) 
The  genetic  constitution  of  the  group  would  be  fully  described  by 
the  proportion,  or  percentage,  of  individuals  that  belonged  to  each 
genotype,  or  in  other  words  by  the  frequencies  of  the  three  genotypes 
among  the  individuals.  These  proportions  or  frequencies  are  called 
genotype  frequencies,  the  frequency  of  a  particular  genotype  being  its 
proportion  or  percentage  among  the  individuals.  If,  for  example,  we 
found one quarter of the individuals in the group to be A1A1, the
frequency of this genotype would be 0.25, or 25 per cent. Naturally
the frequencies of all the genotypes together must add up to unity, or
100 per cent.

Example 1.1. The M-N blood groups in man are determined by two
alleles at a locus, and the three genotypes correspond with the three blood
groups, M, MN, and N. The following figures, taken from the tabulation
of Mourant (1954), show the blood group frequencies among Eskimoes
of East Greenland and among Icelanders:

                 Blood group frequency, %      Number of
                 M        MN       N           individuals

Greenland        83.5     15.6      0.9        569
Iceland          31.2     51.5     17.3        747


Clearly  the  two  populations  differ  in  these  genotype  frequencies,  the  N 
blood  group  being  rare  in  Greenland  and  relatively  common  in  Iceland. 
Not  only  is  this  locus  a  source  of  variation  within  each  of  the  two  popula- 
tions, but  it  is  also  a  source  of  genetic  difference  between  the  populations. 


A  population,  in  the  genetic  sense,  is  not  just  a  group  of  individuals, 
but  a  breeding  group;  and  the  genetics  of  a  population  is  concerned 
not  only  with  the  genetic  constitution  of  the  individuals  but  also  with 
the  transmission  of  the  genes  from  one  generation  to  the  next.  In  the 
transmission  the  genotypes  of  the  parents  are  broken  down  and  a  new 
set  of  genotypes  is  constituted  in  the  progeny,  from  the  genes  trans- 
mitted in  the  gametes.  The  genes  carried  by  the  population  thus  have 
continuity  from  generation  to  generation,  but  the  genotypes  in  which 
they  appear  do  not.  The  genetic  constitution  of  a  population,  refer- 
ring to  the  genes  it  carries,  is  described  by  the  array  of  gene  frequencies; 
that  is  by  specification  of  the  alleles  present  at  every  locus  and  the 
numbers  or  proportions  of  the  different  alleles  at  each  locus.  If, 
for example, A1 is an allele at the A locus, then the frequency of A1
genes, or the gene frequency of A1, is the proportion or percentage
of all genes at this locus that are the A1 allele. The frequencies
of all the alleles at any one locus must add up to unity, or 100 per
cent.

The  gene  frequencies  at  a  particular  locus  among  a  group  of 
individuals  can  be  determined  from  a  knowledge  of  the  genotype 
frequencies.  To  take  a  hypothetical  example,  suppose  there  are  two 
alleles, A1 and A2, and we classify 100 individuals and count the
numbers in each genotype as follows:

                           A1A1    A1A2    A2A2    Total
Number of individuals       30      60      10      100

Number of genes    A1       60      60       0      120
                   A2        0      60      20       80
                                     Total genes    200

Each individual contains two genes, so we have counted 200 representatives
of the genes at this locus. Each A1A1 individual contains
two A1 genes and each A1A2 contains one A1 gene. So there are 120 A1
genes in the sample, and 80 A2 genes. The frequency of A1 is therefore
60 per cent or 0.6, and the frequency of A2 is 40 per cent or 0.4.
To  express  the  relationship  in  a  more  general  form,  let  the  frequencies 
of  genes  and  of  genotypes  be  as  follows: 

                   Genes              Genotypes
                 A1     A2       A1A1    A1A2    A2A2

Frequencies      $p$    $q$      $P$     $H$     $Q$

so that $p + q = 1$, and $P + H + Q = 1$. Since each individual contains
two genes, the frequency of A1 genes is $\tfrac{1}{2}(2P + H)$, and the
relationship between gene frequency and genotype frequency among the
individuals counted is as follows:

$$p = P + \tfrac{1}{2}H, \qquad q = Q + \tfrac{1}{2}H \tag{1.1}$$
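
The gene counting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `gene_frequencies` and its argument names are hypothetical, and the counts are those of the hypothetical sample of 100 individuals (30 A1A1, 60 A1A2, 10 A2A2).

```python
def gene_frequencies(n11, n12, n22):
    """Return (p, q) from counts of A1A1, A1A2 and A2A2 individuals.

    Hypothetical helper: each diploid individual contributes two genes,
    so homozygotes count twice for their allele and heterozygotes once
    for each allele.
    """
    total_genes = 2 * (n11 + n12 + n22)
    a1_genes = 2 * n11 + n12
    a2_genes = 2 * n22 + n12
    return a1_genes / total_genes, a2_genes / total_genes


# Hypothetical sample from the text: 30 A1A1, 60 A1A2, 10 A2A2.
p, q = gene_frequencies(30, 60, 10)
print("frequency of A1:", p)
print("frequency of A2:", q)
```

Dividing the counts through by the number of individuals gives the same result in terms of genotype frequencies, which is the form used in equation 1.1.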


Example 1.2. To illustrate the calculation of gene frequencies from
genotype frequencies we may take the M-N blood group frequencies given
in Example 1.1. The M and N blood groups represent the two homozygous
genotypes and the MN group the heterozygote. The frequency of the M
gene in Greenland is, from equation 1.1, $0.835 + \tfrac{1}{2}(0.156) = 0.913$, and the
frequency of the N gene is $0.009 + \tfrac{1}{2}(0.156) = 0.087$, the sum of the
frequencies being 1.000 as it should be. Doing the same for the Iceland
sample we find the following gene frequencies in the two populations,
expressed now as percentages:

                     Gene
                 M        N

Greenland       91.3      8.7
Iceland         57.0     43.0

Thus  the  two  populations  differ  in  gene  frequency  as  well  as  in  genotype 
frequencies. 
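
The calculation in Example 1.2 can be reproduced as follows. This is a sketch, not part of the original example: the function name `gene_freq_from_genotype_freq` and the dictionary layout are assumptions, while the blood group frequencies are those quoted in Example 1.1.

```python
def gene_freq_from_genotype_freq(P, H, Q):
    """Apply equation 1.1: p = P + H/2 and q = Q + H/2.

    P, H, Q are genotype frequencies (as proportions) of the two
    homozygotes and the heterozygote; they must add up to unity.
    """
    if abs(P + H + Q - 1.0) > 1e-6:
        raise ValueError("genotype frequencies must add up to 1")
    return P + H / 2, Q + H / 2


# Blood group frequencies (M, MN, N) from Example 1.1, as proportions.
samples = {
    "Greenland": (0.835, 0.156, 0.009),
    "Iceland": (0.312, 0.515, 0.173),
}

for name, (freq_m, freq_mn, freq_n) in samples.items():
    p_m, q_n = gene_freq_from_genotype_freq(freq_m, freq_mn, freq_n)
    print(name, "M gene:", round(p_m, 4), "N gene:", round(q_n, 4))
```

The Iceland values fall almost exactly on a rounding boundary (0.5695 and 0.4305), so the percentages printed to one decimal place may differ in the last digit from those tabulated.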


The genetic properties of a population are influenced in the process
of transmission of genes from one generation to the next by a
number  of  agencies.  These  form  the  chief  subject-matter  of  the  next 
four  chapters,  but  we  may  briefly  review  them  here  in  order  to  have 
some  idea  of  what  factors  are  being  left  out  of  consideration  in  this 
chapter.  The  agencies  through  which  the  genetic  properties  of  a 
population  may  be  changed  are  these: 

Population  size.  The  genes  passed  from  one  generation  to  the 
next  are  a  sample  of  the  genes  in  the  parent  generation.  Therefore 
the  gene  frequencies  are  subject  to  sampling  variation  between  suc- 
cessive generations,  and  the  smaller  the  number  of  parents  the  greater 
is  the  sampling  variation.  The  effects  of  sampling  variation  will  be 
considered  in  Chapters  3-5,  and  meantime  we  shall  exclude  it  from 
the discussion by supposing always that we are dealing with a "large
population,"  which  means  simply  one  in  which  sampling  variation  is 
so  small  as  to  be  negligible.  For  practical  purposes  a  "large  popula- 
tion" is  one  in  which  the  number  of  adult  individuals  is  in  the  hundreds 
rather  than  in  the  tens. 

Differences  of  fertility  and  viability.  Though  we  are  not  at 
present  concerned  with  the  phenotypic  effects  of  the  genes  under  dis- 
cussion, we  cannot  ignore  their  effects  on  fertility  and  viability,  be- 
cause these  influence  the  genetic  constitution  of  the  succeeding 
generation.  The  different  genotypes  among  the  parents  may  have 
different  fertilities,  and  if  they  do  they  will  contribute  unequally  to 
the  gametes  out  of  which  the  next  generation  is  formed.  In  this  way 
the  gene  frequency  may  be  changed  in  the  transmission.  Further, 
the  genotypes  among  the  newly  formed  zygotes  may  have  different 
survival  rates,  and  so  the  gene  frequencies  in  the  new  generation  may 
be  changed  by  the  time  the  individuals  are  adult  and  themselves 
become  parents.  These  processes  are  called  selection,  and  will  be 
described  in  Chapter  2.  Meanwhile  we  shall  suppose  they  are  not 
operating.  It  is  difficult  to  find  examples  of  genes  not  subject  to 
selection.  For  the  purpose  of  illustration,  however,  we  may  take  the 
human  blood-group  genes  since  the  selective  forces  acting  on  these 
are  probably  not  very  strong.  Genes  that  produce  a  mutant  pheno- 
type  which  is  abnormal  in  comparison  with  the  wild-type  are,  in 
contrast,  usually  subject  to  much  more  severe  selection. 

Migration  and  mutation.  The  gene  frequencies  in  the  popula- 
tion may  also  be  changed  by  immigration  of  individuals  from  another 
population,  and  by  gene  mutation.  These  processes  will  be  described 
in  Chapter  2,  and  at  this  stage  will  also  be  supposed  not  to  operate. 

Mating  system.  The  genotypes  in  the  progeny  are  determined 
by  the  union  of  the  gametes  in  pairs  to  form  zygotes,  and  the  union  of 
gametes  is  influenced  by  the  mating  of  the  parents.  So  the  genotype 
frequencies  in  the  offspring  generation  are  influenced  by  the  geno- 
types of  the  pairs  that  mate  in  the  parent  generation.  We  shall  at 
first  suppose  that  mating  is  at  random  with  respect  to  the  genotypes 
under  discussion.  Random  mating,  or  panmixia,  means  that  any 
individual  has  an  equal  chance  of  mating  with  any  other  individual  in 
the  population.  The  important  points  are  that  there  should  be  no 
special  tendency  for  mated  individuals  to  be  alike  in  genotype,  or  to 
be  related  to  each  other  by  ancestry.  If  a  population  covers  a  large 
geographic  area  individuals  inhabiting  the  same  locality  are  more 
likely to mate than individuals inhabiting different localities, and so
the  mated  pairs  tend  to  be  related  by  ancestry.  A  widely  spread 
population  is  therefore  likely  to  be  subdivided  into  local  groups  and 
mating  is  random  only  within  the  groups.  The  properties  of  sub- 
divided populations  depend  on  the  size  of  the  local  groups,  and  will 
be  described  under  the  effects  of  population  size  in  Chapters  3-5. 


Hardy-Weinberg  Equilibrium 


In a large random-mating population both gene frequencies and
genotype frequencies are constant from generation to generation, in
the absence of migration, mutation and selection; and the genotype
frequencies are determined by the gene frequencies. These properties
of a population were first demonstrated by Hardy and by Weinberg
independently in 1908, and are generally known as the Hardy-Weinberg
Law. (See Stern, 1943, where a translation of the relevant
part  of  Weinberg's  paper  will  be  found.)  Such  a  population  is  said 
to  be  in  Hardy-Weinberg  equilibrium.  Deduction  of  the  Hardy- 
Weinberg  Law  involves  three  steps:  (1)  from  the  parents  to  the 
gametes  they  produce;  (2)  from  the  union  of  the  gametes  to  the  geno- 
types in  the  zygotes  produced;  and  (3)  from  the  genotypes  of  the 
zygotes  to  the  gene  frequency  in  the  progeny  generation.  These  steps, 
in  detail,  are  as  follows: 

1. Let the parent generation have gene and genotype frequencies
as follows:

                   Genes              Genotypes
                 A1     A2       A1A1    A1A2    A2A2

Frequencies      $p$    $q$      $P$     $H$     $Q$


Two sorts of gametes are produced, those bearing A1 and those bearing
A2. The frequencies of these gametic types are the same as the
gene frequencies, p and q, in the generation producing them, for this
reason: A1A1 individuals produce only A1 gametes, and A1A2 individuals
produce equal numbers of A1 and A2 gametes (provided, of
course, there is no anomaly of segregation). So the frequency of A1
gametes produced by the whole population is $P + \tfrac{1}{2}H$, which by
equation 1.1 is the gene frequency of A1.

2.  Random  mating  between  individuals  is  equivalent  to  random 
union  among  their  gametes.  We  can  think  of  a  pool  of  gametes  to 
which  all  the  individuals  contribute  equally;  zygotes  are  formed  by 
random union between pairs of gametes from the pool. The genotype
frequencies  among  the  zygotes  are  then  the  products  of  the  frequencies 
of  the  gametic  types  that  unite  to  produce  them.  The  genotype 
frequencies  among  the  progeny  produced  by  random  mating  can 
therefore  be  determined  simply  by  multiplying  the  frequencies  of  the 
gametic  types  as  shown  in  the  following  table: 


                                   Female gametes and
                                   their frequencies
                                     A1          A2
                                     $p$         $q$

Male gametes     A1    $p$          A1A1        A1A2
and their                           $p^2$       $pq$
frequencies
                 A2    $q$          A1A2        A2A2
                                    $pq$        $q^2$

We need not distinguish the union of A1 eggs with A2 sperms from
that of A2 eggs with A1 sperms; so the genotype frequencies of the
zygotes are

        A1A1       A1A2       A2A2
        $p^2$      $2pq$      $q^2$          (1.2)


Note  that  these  genotype  frequencies  depend  only  on  the  gene  fre- 
quency in  the  parents,  and  not  on  the  parental  genotype  frequencies, 
provided  the  parents  mate  at  random. 

3. Finally we use these genotype frequencies to determine the
gene frequency in the offspring generation. Applying equation 1.1
we find the gene frequency of A1 is $p^2 + \tfrac{1}{2}(2pq) = p(p + q) = p$, which is
the same as in the parent generation.
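
The three steps of the deduction can be followed numerically. The sketch below is only an illustration: the function name `random_mating_generation` is hypothetical, and the parental genotype frequencies (0.3, 0.6, 0.1) are the arbitrary values of the hypothetical sample used earlier in the chapter.

```python
def random_mating_generation(P, H, Q):
    """Follow one generation of random mating in a large population.

    Assumes no migration, mutation or selection, and normal segregation.
    Returns the gamete frequencies, the zygote genotype frequencies and
    the gene frequency of A1 among the progeny.
    """
    # Step 1: gamete frequencies equal the parental gene frequencies.
    p = P + H / 2
    q = Q + H / 2
    # Step 2: random union of gametes gives the zygote frequencies.
    zygotes = (p * p, 2 * p * q, q * q)
    # Step 3: gene frequency of A1 among the progeny.
    p_progeny = zygotes[0] + zygotes[1] / 2
    return (p, q), zygotes, p_progeny


gametes, zygotes, p_progeny = random_mating_generation(0.3, 0.6, 0.1)
print("gamete frequencies:", gametes)
print("zygote frequencies:", zygotes)
print("gene frequency of A1 in progeny:", p_progeny)
```

Apart from floating-point rounding, the gene frequency returned in the last step equals the gamete frequency computed in the first, whatever parental genotype frequencies are supplied.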

The properties of a population with respect to a single locus,
expressed in the Hardy-Weinberg law and demonstrated above, are
these:

(1) A large random-mating population, in the absence of migration,
mutation, and selection, is stable with respect to both gene and
genotype  frequencies:  there  is  no  inherent  tendency  for  its  genetic 
properties  to  change  from  generation  to  generation. 

(2) The genotype frequencies in the progeny produced by random
mating among the parents are determined solely by the gene
frequencies among the parents. Consequently:

    (a) a population in Hardy-Weinberg equilibrium has the
        relationship expressed in equation 1.2 between the gene and
        genotype frequencies in any one generation; and

    (b) these Hardy-Weinberg genotype frequencies are established
        by one generation of random mating, irrespective of the
        genotype frequencies among the parents.
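
Property (2) can be checked with a small numerical sketch. The two parental populations below are invented for illustration: they have the same gene frequency (0.6) but quite different genotype frequencies, one of them having no heterozygotes at all. The function name `offspring_genotype_frequencies` is likewise hypothetical.

```python
def offspring_genotype_frequencies(P, H, Q):
    """Genotype frequencies after one generation of random mating.

    Only the gene frequency p = P + H/2 of the parents enters the
    result; the parental genotype frequencies themselves do not.
    """
    p = P + H / 2
    q = 1.0 - p
    return (p * p, 2 * p * q, q * q)


# Two invented parental populations, both with gene frequency 0.6.
parents_a = (0.3, 0.6, 0.1)
parents_b = (0.6, 0.0, 0.4)

progeny_a = offspring_genotype_frequencies(*parents_a)
progeny_b = offspring_genotype_frequencies(*parents_b)
print("progeny of A:", progeny_a)
print("progeny of B:", progeny_b)

# A further generation of random mating leaves the frequencies unchanged.
print("second generation:", offspring_genotype_frequencies(*progeny_a))
```

Both populations give the same progeny frequencies, and a second generation of random mating reproduces them, apart from floating-point rounding.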

Fig. 1.1. Relationship between genotype frequencies and gene
frequency for two alleles in a population in Hardy-Weinberg
equilibrium.

We shall later give another proof of the Hardy-Weinberg law by
a different method. Let us now first illustrate the properties of a
population in Hardy-Weinberg equilibrium, and then show to what
uses these properties can be put. The relationship between gene
frequency and genotype frequencies expressed in equation 1.2 is
illustrated graphically in Fig. 1.1, which shows how the frequencies
of the three genotypes for a locus with two alleles depend on the gene
frequency. As an example of the Hardy-Weinberg genotype frequencies
we shall take again the M-N blood groups in man.

Example 1.3. Race and Sanger (1954) quote the following frequencies
(%) of the M-N blood groups in a sample of 1,279 English people. From
the observed genotype (i.e. blood group) frequencies we can calculate the
gene frequencies by equation 1.1. These gene frequencies are shown on
the right.

                     Blood group                   Gene
               M         MN        N           M         N

Observed     28.38     49.57     22.05       53.165    46.835
Expected     28.265    49.800    21.935

Now from the gene frequencies we can calculate the expected Hardy-Weinberg
genotype frequencies by equation 1.2, and we find that the
observed frequencies agree very closely with those expected for a
population in Hardy-Weinberg equilibrium.
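
The figures of Example 1.3 can be recomputed in a few lines. This is a sketch for checking the arithmetic only: the variable names are hypothetical, and the observed percentages are those quoted from Race and Sanger (1954).

```python
# Observed M-N blood group frequencies (per cent) from Example 1.3.
observed = {"M": 28.38, "MN": 49.57, "N": 22.05}

# Equation 1.1: gene frequencies from genotype frequencies (proportions).
p = (observed["M"] + observed["MN"] / 2) / 100   # frequency of the M gene
q = (observed["N"] + observed["MN"] / 2) / 100   # frequency of the N gene

# Equation 1.2: expected Hardy-Weinberg genotype frequencies (per cent).
expected = {
    "M": 100 * p * p,
    "MN": 100 * 2 * p * q,
    "N": 100 * q * q,
}

print("gene frequencies (%):", round(100 * p, 3), round(100 * q, 3))
for group in ("M", "MN", "N"):
    print(group, "observed:", observed[group],
          "expected:", round(expected[group], 3))
```

The expected values agree with those tabulated in the example to within rounding in the third decimal place.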

Comparison of observed with expected genotype frequencies may
be regarded as a test of the fulfilment of the conditions on which the
Hardy-Weinberg equilibrium depends. These conditions are:
random mating among the parents of the individuals observed, equal
fertility of the different genotypes among the parents, and equal
viability of the different genotypes among the offspring from
fertilisation up to the time of observation. In addition, the classification of
individuals as to genotype must have been correctly made. The
blood group frequencies in Example 1.3 give no cause to doubt the
fulfilment of these conditions. It should be noted, however, that a
difference of fertility or of viability between the genotypes, though it
can be detected, cannot be measured from a comparison of observed
with expected frequencies (Wallace, 1958). The expected frequencies
are based on the observed gene frequencies after the differences of
fertility or viability have had their effect. In order to measure these effects
we should have to know the original gene or genotype frequencies.

At the beginning of the chapter we saw, in equation 1.1, how the
gene  frequencies  among  a  group  of  individuals  can  be  determined 
from  their  genotype  frequencies;  but  for  this  it  was  necessary  to  know 
the  frequencies  of  all  three  genotypes.  Consequently  the  relationship 
in  equation  1.1  cannot  be  applied  to  the  case  of  a  recessive  allele, 
when the heterozygote is indistinguishable from the dominant
homozygote. Consideration of the population as a breeding unit, however,
shows that when the conditions for Hardy-Weinberg equilibrium
hold,  only  the  frequency  of  one  of  the  homozygous  genotypes  is 
needed  to  determine  the  gene  frequency,  and  the  difficulty  of  recessive 
genes  is  thus  overcome.  Let  A2,  for  example,  be  a  recessive  gene 
with  frequency  q;  then  the  frequency  of  A2A2  homozygotes  is  q2.  In 
other  words  the  gene  frequency  is  the  square  root  of  the  homozygote 
frequency.  Thus  we  can  determine  the  gene  frequency  of  recessive 
abnormalities,  provided  that  selective  mortality  of  the  homozygote 
can  be  discounted  or  allowed  for.  But  we  can  go  further,  and  this  is 
often  the  more  important  point:  we  can  also  determine  the  frequency 
of  heterozygotes,  or  "carriers,"  of  recessive  abnormalities,  which  is  f 
2q(i  -q).  It  comes  as  a  surprise  to  most  people  to  discover  how  com-C-  J^ 
mem  heterozygotes  of  a  rare  recessive  abnormality  are. 




Example 1.4. Albinism in man is probably determined by a single
recessive autosomal gene, and the frequency of albinos is about 1/20,000
in human populations (see Stern, 1949). If $q$ is the frequency of the albino
gene, then $q^2 = 1/20{,}000$, and $q = 1/141$, if selective mortality is disregarded.
The frequency of heterozygotes is then $2q(1 - q)$, which works out to about
1/70. So about one person in seventy is a heterozygote for albinism,
though only one in twenty thousand is a homozygote.

Example  1.5.  There  is  a  recessive  autosomal  gene  in  the  Ayrshire 
breed  of  cattle  in  Britain  which  causes  dropsy  in  the  new-born  calf.  The 
frequency  of  this  abnormality  is  about  1  in  300  births  (Donald,  Deas,  and 
Wilson,  1952).  A  means  of  reducing  the  frequency  of  the  defect  would 
obviously  be  the  avoidance  of  the  use  of  bulls  known  or  thought  to  be 
heterozygous.  We  might  first  want  to  know  what  proportion  of  bulls 
would  be  expected  to  be  heterozygotes.  In  this  case  the  conditions  for 
Hardy-Weinberg  equilibrium  are  certainly  not  all  fulfilled:  the  breed  is  not 
a  single  random-breeding  population,  and  the  abnormal  homozygotes  are 
not  fully  viable  up  to  the  time  of  birth.  So  we  can  only  get  a  rough  idea  of 
the  frequency  of  heterozygotes  by  assuming  the  observations  to  refer  to  a 
population  in  Hardy-Weinberg  equilibrium.  On  this  assumption, 
$q^2 = 0.0033$, so $q = 0.057$; the frequency of heterozygotes is $2q(1 - q) = 0.11$.
So  we  should  expect,  very  approximately,  one  bull  in  ten  to  be  a  hetero- 
zygote. 
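
The arithmetic of Examples 1.4 and 1.5 can be checked with a short Python sketch. The function name is an illustrative choice, not something from the text, and the calculation assumes Hardy-Weinberg equilibrium with no selective mortality, exactly as the examples do.

```python
import math


def recessive_gene_and_carrier_frequency(homozygote_frequency):
    """Estimate q and the carrier frequency 2q(1 - q) from the frequency
    of recessive homozygotes, assuming Hardy-Weinberg equilibrium."""
    q = math.sqrt(homozygote_frequency)   # q is the square root of q^2
    carriers = 2 * q * (1 - q)            # heterozygote frequency
    return q, carriers


# Example 1.4: albinos occur at about 1 in 20,000
albino_q, albino_carriers = recessive_gene_and_carrier_frequency(1 / 20000)
# Example 1.5: dropsy in Ayrshire calves occurs at about 1 in 300 births
dropsy_q, dropsy_carriers = recessive_gene_and_carrier_frequency(1 / 300)

print(f"albinism: q = 1/{1 / albino_q:.0f}, carriers = 1/{1 / albino_carriers:.0f}")
print(f"dropsy:   q = {dropsy_q:.3f}, carriers = {dropsy_carriers:.2f}")
```

The point the sketch brings out is the one made in the text: a homozygote frequency of one in many thousands still implies a carrier frequency of the order of one in a hundred.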

Mating frequencies and another proof of the Hardy-Weinberg
law. Let us now look more closely into the breeding
structure of a random-mating population, distinguishing the types of
mating according to the genotypes of the pairs, and seeing what are
the genotype frequencies among the progenies of the different types
of mating. This provides a general method for relating genotype
frequencies in successive generations, which we shall use in a later
chapter. It also provides another proof of the Hardy-Weinberg law;
a proof more cumbersome than that already given but showing more
clearly how the Hardy-Weinberg frequencies arise from the
Mendelian laws of segregation. The procedure is to obtain first the
frequencies of all possible mating types according to the frequencies
of the genotypes among the parents, and then to obtain the
frequencies of genotypes among the progeny of each type of mating
according to the Mendelian ratios.

Consider a locus with two alleles, and let the frequencies of genes
and genotypes in the parents be, as before,

                   Genes              Genotypes
                 A1     A2       A1A1    A1A2    A2A2
Frequencies      p      q        P       H       Q

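There are altogether nine types of mating, and their frequencies
when mating is random are found thus:

                                Genotype and frequency of male parent
                                A1A1       A1A2       A2A2
                                P          H          Q
Genotype and      A1A1   P      $P^2$      $PH$       $PQ$
frequency of      A1A2   H      $PH$       $H^2$      $HQ$
female parent     A2A2   Q      $PQ$       $HQ$       $Q^2$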

Since the sex of the parent is irrelevant in this context, some of the
types of mating are equivalent, and the number of different types
reduces to six. By summation of the frequencies of equivalent types,
we obtain the frequencies of mating types in the first two columns of
Table 1.1. Now we have to consider the genotypes of offspring
produced by each type of mating, and find the frequency of each
genotype in the total progeny, assuming, of course, that all types of mating
are equally fertile and all genotypes equally viable. This is done in
the right hand side of Table 1.1. Thus, for example, matings of the
type A1A1 × A1A1 produce only A1A1 offspring. So, of all the A1A1
genotypes in the total progeny, a proportion $P^2$ come from this type
of mating. Similarly a quarter of the offspring of A1A2 × A1A2
matings are A1A1. So this type of mating, which has a frequency of
$H^2$, contributes a proportion $\tfrac{1}{4}H^2$ of the total A1A1 progeny. To find
the frequency of each genotype in the total progeny we add the


Table 1.1

        Mating                    Genotype and frequency of progeny
Type            Frequency     A1A1                    A1A2                                        A2A2
A1A1 × A1A1     $P^2$         $P^2$                   -                                           -
A1A1 × A1A2     $2PH$         $PH$                    $PH$                                        -
A1A1 × A2A2     $2PQ$         -                       $2PQ$                                       -
A1A2 × A1A2     $H^2$         $\tfrac{1}{4}H^2$       $\tfrac{1}{2}H^2$                           $\tfrac{1}{4}H^2$
A1A2 × A2A2     $2HQ$         -                       $HQ$                                        $HQ$
A2A2 × A2A2     $Q^2$         -                       -                                           $Q^2$

Sums                          $(P + \tfrac{1}{2}H)^2$   $2(P + \tfrac{1}{2}H)(Q + \tfrac{1}{2}H)$   $(Q + \tfrac{1}{2}H)^2$
                            = $p^2$                   $2pq$                                       $q^2$

frequencies  contributed  by  each  type  of  mating.  The  sums,  after 
simplification,  are  given  at  the  foot  of  the  table,  and  from  the  identity 
given in equation 1.1 they are seen to be equal to $p^2$, $2pq$, and $q^2$.
These  are  the  Hardy-Weinberg  equilibrium  frequencies,  and  we 
have  shown  that  they  are  attained  by  one  generation  of  random  mating, 
irrespective  of  the  genotype  frequencies  among  the  parents. 
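
The bookkeeping of Table 1.1 can be reproduced mechanically. The following Python sketch, whose function name and data layout are illustrative assumptions, enumerates the nine matings, weights each by its random-mating frequency, applies Mendelian segregation, and sums the progeny. Exact fractions are used so that the identity with $p^2$, $2pq$, $q^2$ is not blurred by rounding.

```python
from fractions import Fraction
from itertools import product


def progeny_frequencies(P, H, Q):
    """Progeny genotype frequencies (A1A1, A1A2, A2A2) after one generation
    of random mating among parents with genotype frequencies P, H, Q."""
    # For each parental genotype: (frequency, chance a gamete carries A1)
    parents = [(P, Fraction(1)), (H, Fraction(1, 2)), (Q, Fraction(0))]
    a1a1 = a1a2 = a2a2 = Fraction(0)
    for (f_freq, f_a1), (m_freq, m_a1) in product(parents, repeat=2):
        mating = f_freq * m_freq                      # random-mating frequency
        a1a1 += mating * f_a1 * m_a1                  # both gametes A1
        a1a2 += mating * (f_a1 * (1 - m_a1) + (1 - f_a1) * m_a1)
        a2a2 += mating * (1 - f_a1) * (1 - m_a1)      # both gametes A2
    return a1a1, a1a2, a2a2


# Arbitrary parental frequencies, deliberately not in Hardy-Weinberg proportions
P, H, Q = Fraction(1, 2), Fraction(1, 5), Fraction(3, 10)
p = P + H / 2   # gene frequency of A1 (equation 1.1)
q = Q + H / 2   # gene frequency of A2

progeny = progeny_frequencies(P, H, Q)
print(progeny)
print((p * p, 2 * p * q, q * q))
```

Whatever values of P, H and Q are supplied (summing to 1), the two printed tuples should coincide, which is the content of the proof.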

Multiple  alleles.  Restriction  of  the  treatment  to  two  alleles  at  a 
locus  suffices  for  many  purposes.  If  we  are  interested  in  one 
particular  allele,  as  often  happens,  then  all  the  other  alleles  at  the 
locus  can  be  treated  as  one.  Formulation  of  the  situation  in  terms  of 
two  alleles  is  therefore  often  possible  even  if  there  are  in  fact  more 
than  two.  If  we  are  interested  in  more  than  one  allele  we  can  still,  if 
we  like,  treat  the  situation  as  a  two-allele  system  by  considering  each 
allele  in  turn  and  lumping  the  others  together.  But  the  treatment  can 
be easily extended to cover more than two alleles, and no new
principle is introduced. In general, if $q_1$ and $q_2$ are the frequencies of any
two alleles, A1 and A2, of a multiple series, then the genotype
frequencies under Hardy-Weinberg equilibrium are as follows (Li,

Genotype:       A1A1        A1A2          A2A2
Frequency:      $q_1^2$     $2q_1q_2$     $q_2^2$
These frequencies are also attained by one generation of random
mating.  This  can  readily  be  seen  by  reducing  the  situation  to  a  two- 
allele  system,  and  considering  each  allele  in  turn.  Or  it  can  be 
proved,  though  somewhat  more  laboriously,  by  the  method  explained 
above  for  the  two-allele  system. 

Example 1.6. The ABO blood groups in man are determined by a
series of allelic genes. For the purpose of illustration we shall recognise
three alleles, A, B, and O, and show how the gene frequencies can be
estimated from the blood group frequencies. Let the frequencies of the
A, B, and O genes be $p$, $q$, and $r$ respectively, so that $p + q + r = 1$. The
following table shows (1) the genotypes, (2) the blood groups (i.e.
phenotypes) corresponding to the different genotypes, (3) the expected
frequencies of the blood groups in terms of $p$, $q$, and $r$, on the assumption of
Hardy-Weinberg equilibrium, (4) observed frequencies of blood groups in
a sample of 190,177 United Kingdom airmen, quoted by Race and Sanger
(1954).

Genotype            AA, AO          BB, BO          OO         AB
Blood group         A               B               O          AB
Frequency (%)
  expected          $p^2 + 2pr$     $q^2 + 2qr$     $r^2$      $2pq$
  observed          41.716          8.560           46.684     3.040

Calculation of the gene frequencies is rather more complicated than with
two alleles. The following is the simplest method: a more refined method
is described by Ceppellini et al. (1955). First, the frequency of the O gene
is simply the square root of the frequency of the O group. Next it will be
seen that the sum of the frequencies of the B and O groups is
$q^2 + 2qr + r^2 = (q + r)^2 = (1 - p)^2$. So $p = 1 - \sqrt{B + O}$, where B and O
are the frequencies of the blood groups B and O. In the same way
$q = 1 - \sqrt{A + O}$, and we have seen that $r = \sqrt{O}$. This method gives the
following gene frequencies in the sample:

A gene:     $p$ = 0.2567
B gene:     $q$ = 0.0598
O gene:     $r$ = 0.6833

Total             0.9998


As a result of sampling errors these frequencies do not add up exactly to
unity, but we shall not trouble to make an adjustment for so small a
discrepancy. We may now calculate the expected frequency of the AB blood
group, which has not been used in arriving at these gene frequencies, and
see whether the observed frequency agrees satisfactorily. The expected
frequency of AB from estimates of $p$ and $q$ is 3.070 per cent, which is in
good agreement with the observed frequency of 3.040 per cent. ($\chi^2 = 0.7$,
with 1 d.f., calculated by the method given by Race and Sanger.)
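
The simple square-root method of Example 1.6 is easily put into a few lines of Python. The sketch below uses the observed percentages from the table; the variable names are illustrative, and no adjustment is made for the small shortfall of the total from unity, following the text.

```python
import math

# Observed blood group frequencies (as proportions) from Example 1.6
group_a, group_b, group_o, group_ab = 0.41716, 0.08560, 0.46684, 0.03040

r = math.sqrt(group_o)                 # O gene: r^2 is the frequency of group O
p = 1 - math.sqrt(group_b + group_o)   # A gene: B + O = (q + r)^2 = (1 - p)^2
q = 1 - math.sqrt(group_a + group_o)   # B gene: A + O = (p + r)^2 = (1 - q)^2

expected_ab = 2 * p * q                # group AB was not used in the estimates

print(f"p = {p:.4f}, q = {q:.4f}, r = {r:.4f}, total = {p + q + r:.4f}")
print(f"expected AB = {100 * expected_ab:.3f} per cent, "
      f"observed = {100 * group_ab:.3f} per cent")
```

Because the AB group plays no part in the estimation, the comparison in the last line is a genuine check on the Hardy-Weinberg assumption rather than a circular one.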


Sex-linked  genes.  With  sex-linked  genes  the  situation  is  rather 
more  complex  than  with  autosomal  genes.  The  relationship  between 
gene  frequency  and  genotype  frequency  in  the  homogametic  sex  is 
the  same  as  with  an  autosomal  gene,  but  the  heterogametic  sex  has 
only  two  genotypes  and  each  individual  carries  only  one  gene  instead 
of two. For this reason two-thirds of the sex-linked genes in the
population are carried by the homogametic sex and one-third by the
heterogametic. For the sake of brevity we shall now refer to the
heterogametic sex as male. Consider two alleles, A1 and A2, with
frequencies $p$ and $q$, and let the genotypic frequencies be as follows:


              Females                    Males
      A1A1     A1A2     A2A2          A1      A2
      P        H        Q             R       S


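The frequency of A1 among the females is then $p_f = P + \tfrac{1}{2}H$, and the
frequency among the males is $p_m = R$. The frequency of A1 in the
whole population is

\[
\begin{aligned}
\bar{p} &= \tfrac{2}{3}p_f + \tfrac{1}{3}p_m && (1.3) \\
        &= \tfrac{1}{3}(2p_f + p_m) \\
        &= \tfrac{1}{3}(2P + H + R) && (1.4)
\end{aligned}
\]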


Now,  if  the  gene  frequencies  among  males  and  among  females  are 
different,  the  population  is  not  in  equilibrium.  The  gene  frequency 
in  the  population  as  a  whole  does  not  change,  but  its  distribution 
between  the  two  sexes  oscillates  as  the  population  approaches  equili- 
brium. The  reason  for  this  can  be  seen  from  the  following  con- 
siderations. Males  get  their  sex-linked  genes  only  from  their 
mothers;  therefore  pm  is  equal  to  pf  in  the  previous  generation. 
Females get their sex-linked genes equally from both parents; therefore
$p_f$ is equal to the mean of $p_m$ and $p_f$ in the previous generation.
Using primes to indicate the previous generation, we have

\[
p_m = p'_f, \qquad p_f = \tfrac{1}{2}(p'_m + p'_f)
\]

The difference between the frequencies in the two sexes is

\[
\begin{aligned}
p_f - p_m &= \tfrac{1}{2}(p'_m + p'_f) - p'_f \\
          &= -\tfrac{1}{2}(p'_f - p'_m)
\end{aligned}
\]

i.e.  half  the  difference  in  the  previous  generation,  but  in  the  other 
direction.  Therefore  the  distribution  of  the  genes  between  the  two 
sexes  oscillates,  but  the  difference  is  halved  in  successive  generations 
and  the  population  rapidly  approaches  an  equilibrium  in  which  the 
Fig. 1.2. Approach to equilibrium under random mating for a
sex-linked gene, showing the gene frequency among females,
among males, and in the two sexes combined. The population
starts with females all of one sort ($q_f = 1$), and males all of the
other sort ($q_m = 0$).

frequencies  in  the  two  sexes  are  equal.  The  situation  is  illustrated 
in  Fig.  1.2,  which  shows  the  consequences  of  mixing  females  of  one 
sort (all A1A1) with males of another sort (all A2) and letting them
breed  at  random. 
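
The oscillating approach to equilibrium can be traced numerically. The following Python sketch applies the two recurrence relations given above, starting from the situation of Fig. 1.2 (females all of one sort, males all of the other); the function name is an illustrative choice.

```python
def sex_linked_approach(q_f, q_m, generations):
    """Gene frequencies (females, males) in successive generations of
    random mating for a sex-linked gene."""
    history = [(q_f, q_m)]
    for _ in range(generations):
        # Males take their gene from their mothers; females take the
        # mean of the frequencies in the two sexes of the previous generation.
        q_f, q_m = (q_f + q_m) / 2, q_f
        history.append((q_f, q_m))
    return history


history = sex_linked_approach(q_f=1.0, q_m=0.0, generations=6)
for generation, (q_f, q_m) in enumerate(history):
    overall = (2 * q_f + q_m) / 3   # two-thirds of the genes are in females
    print(f"{generation}: females {q_f:.4f}  males {q_m:.4f}  "
          f"difference {q_f - q_m:+.4f}  overall {overall:.4f}")
```

The printed difference changes sign and halves in each generation, while the overall frequency stays at two-thirds throughout, as the argument in the text requires.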

Example 1.7. Searle (1949) gives the frequencies of a number of
genes in a sample of cats in London. The animals examined were sent to
clinics for destruction; they were therefore not necessarily a random
sample. Among the genes studied was "yellow" (y) which is sex-linked
and for which all three genotypes in females are recognisable, the
heterozygote being tortoise-shell. The data were used to test for agreement with
Hardy-Weinberg equilibrium. The numbers observed in each phenotypic
class are shown in table (i). We may first see whether the gene frequency


(i)
                           Females                   Males
                      ++       +y       yy         +        y
Numbers observed      277      54       7          311      42
Numbers expected      269.6    64.5     3.9        315.2    37.8

(ii)
                  Numbers of genes
                  +        y         $q_y$
in females        608      68        0.101
in males          311      42        0.119
total             919      110       0.107

is equal in the two sexes. The numbers of genes counted, and the
frequency ($q$) of the gene y, in each sex are as given in table (ii). The
$\chi^2$ testing difference in $q$ between the sexes is 0.4 which is quite
insignificant. There is therefore no reason to think the population is not
in equilibrium, and we may take the estimate of gene frequency from both
sexes combined: it is $q = 0.107$. From this estimate of $q$ the expected
numbers in the different phenotypic classes are calculated; they are shown
in table (i). Only the females are relevant to the test of random mating.
The $\chi^2$ testing agreement between observed and expected numbers in
females is 4.4, with 2 degrees of freedom. This has a probability of 0.1 and
cannot be judged significant. The data are therefore compatible with the
Hardy-Weinberg equilibrium, in spite of the deficiency of tortoise-shell
females. If the deficiency of heterozygous females were real we might
attribute it to the method of sampling and infer that the tortoise-shells
were sent for destruction less often than the other colours, on account of
human preference.
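
The expected numbers and the test on the females in Example 1.7 can be recomputed as follows. This Python sketch is only an illustration: the variable names are assumptions, the gene frequency is used unrounded (so the results differ slightly in the last figure from those printed, which are based on $q = 0.107$), and the probability uses the fact that the upper tail of $\chi^2$ with 2 degrees of freedom is $e^{-\chi^2/2}$.

```python
import math

# Observed numbers from table (i) of Example 1.7
females = {"++": 277, "+y": 54, "yy": 7}
males = {"+": 311, "y": 42}

# Gene counting: each female carries two genes, each male one
y_genes = females["+y"] + 2 * females["yy"] + males["y"]
all_genes = 2 * sum(females.values()) + sum(males.values())
q = y_genes / all_genes          # frequency of y in both sexes combined
p = 1 - q

# Expected numbers of females under Hardy-Weinberg equilibrium
n_females = sum(females.values())
expected = {
    "++": n_females * p ** 2,
    "+y": n_females * 2 * p * q,
    "yy": n_females * q ** 2,
}

chi2 = sum((females[g] - expected[g]) ** 2 / expected[g] for g in females)
# Degrees of freedom taken as 2, as in the text; tail area for 2 d.f.
probability = math.exp(-chi2 / 2)

print(f"q = {q:.3f}")
print({g: round(n, 1) for g, n in expected.items()})
print(f"chi-square = {chi2:.2f}, probability = {probability:.2f}")
```

The largest single contribution to the $\chi^2$ comes from the yy class, whose expected number is small, which is one reason for caution in interpreting the apparent deficiency of tortoise-shells.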


More  than  one  locus.  The  attainment  of  the  equilibrium  in 
genotype  frequencies  after  one  generation  of  random  mating  is  true 
of  all  autosomal  loci  considered  separately.  But  it  is  not  true  of  the 
genotypes  with  respect  to  two  or  more  loci  considered  jointly.  To 
illustrate the point, consider a population made up of equal numbers
of A1A1B1B1 and A2A2B2B2 individuals, of both sexes. The gene
frequency at both loci is then $\tfrac{1}{2}$, and if the individuals mated at
random only three out of the nine genotypes would appear in the
progeny; the genotype A1A1B2B2, for example, would be absent though
its frequency in an equilibrium population would be $\tfrac{1}{16}$. The missing
genotypes appear in subsequent generations, but not immediately
at their equilibrium frequencies. The approach to equilibrium is
described by Li (1955a) and here we shall only outline the
conclusions.

Consider  two  loci  each  with  two  alleles,  and  let  the  frequencies  of 
the  four  types  of  gamete  formed  by  the  initial  population  be  as  fol- 
lows: 

type of gamete      A1B1    A1B2    A2B1    A2B2
frequency           r       s       t       u

Then  if  the  population  is  in  equilibrium,  ru=st,  as  may  be  seen  by 
writing  the  gametic  frequencies  in  terms  of  the  gene  frequencies. 
The  difference,  ru  -  st,  gives  a  measure  of  the  extent  of  the  departure 
from  equilibrium.  This  difference  is  halved  in  each  successive  genera- 
tion of  random  mating,  and  the  approach  to  equilibrium  is  thus  fairly 
rapid  (see  Fig.  1.3).  If,  however,  more  than  two  loci  are  to  be  con- 
sidered jointly  the  approach  to  equilibrium  becomes  progressively 
slower  as  the  number  of  loci  increases. 

Linked  loci.  If  two  loci  are  linked  the  approach  to  equilibrium 
under  random  mating  is  slower  in  proportion  to  the  closeness  of  the 
linkage.  When  equilibrium  is  reached  the  coupling  and  repulsion 
phases  are  equally  frequent;  the  frequencies  of  the  gametic  types  then 
depend  only  on  the  gene  frequencies  and  not  at  all  on  the  linkage.  It 
is  easy  to  suppose  that  association  between  two  characters,  as  for 
example  between  hair  colour  and  eye  colour,  is  evidence  of  linkage 
between  the  genes  concerned.  Association  between  characters, 
however,  is  more  often  evidence  of  pleiotropy  than  of  linkage.  Link- 
age can  give  rise  to  association  only  after  a  mixture  of  populations, 
the  length  of  time  that  the  association  persists  depending  on  the 
closeness  of  the  linkage. 

The  approach  to  equilibrium  after  the  mixture  of  populations 
differing  in  respect  of  the  genes  at  two  linked  loci  can  be  described  in 
the manner of the preceding section. The departure from
equilibrium, $d$, is expressed as $d = ru - st$, where $ru$ is the frequency of
coupling heterozygotes and $st$ that of repulsion heterozygotes. If $c$
is the frequency of recombination between the two loci then the
difference, $d$, at generation $t$ is

\[
d_t = (1 - c)\,d_{t-1}
\]

Thus  if,  for  example,  there  is  25  per  cent  recombination  the  difference 
is  reduced  by  one  quarter  in  each  generation;  or  if  there  is  10  per  cent 
recombination  the  difference  is  reduced  by  10  per  cent  in  each 
Fig. 1.3. Approach to equilibrium under random mating of two
loci, considered jointly. The graphs show the difference of
frequency (d) between coupling and repulsion heterozygotes in
successive generations, starting with all individuals repulsion
heterozygotes. The five graphs refer to different degrees of linkage
between the two loci, as indicated by the recombination frequency
shown alongside each graph. The graph marked .5 refers to
unlinked loci.


generation.  Closely  linked  loci  will  therefore  continue  for  a  consider- 
able time  to  show  the  effects  of  a  past  mixture  of  populations.  The 
approach  to  equality  of  coupling  and  repulsion  phases  with  different 
degrees  of  linkage  is  illustrated  in  Fig.  1.3. 
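
The rate of approach for different degrees of linkage can be sketched in Python by applying the recurrence $d_t = (1 - c)d_{t-1}$ repeatedly, which gives $d_t = (1 - c)^t d_0$. The function names and the helper for the number of generations needed to halve the departure are illustrative additions, not part of the text; unlinked loci correspond to $c = 0.5$.

```python
import math


def departure_from_equilibrium(d0, c, t):
    """Departure d after t generations of random mating, starting from d0,
    with recombination frequency c between the two loci."""
    return d0 * (1 - c) ** t


def generations_to_halve(c):
    """Number of generations for d to fall to half its value."""
    return math.log(0.5) / math.log(1 - c)


for c in (0.5, 0.25, 0.1, 0.01):
    remaining = departure_from_equilibrium(1.0, c, 10)
    print(f"c = {c:<4}  d after 10 generations = {remaining:.4f} of d0;  "
          f"halved in {generations_to_halve(c):.1f} generations")
```

With unlinked loci the departure is halved every generation, whereas with close linkage a large part of it remains after ten generations, which is the behaviour shown by the graphs of Fig. 1.3.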
Assortative mating. Assortative mating is a form of non-random
mating, but this is the most convenient place to mention it. If the
mated pairs tend to be of the same genotype more often than would
occur by chance this is called positive assortative mating, and if less
often it is called negative assortative (or sometimes disassortative)
mating. The consequences are described by Wright (1921) and
summarised by Li (1955a) and will be only briefly outlined here.
Positive assortative mating is of some importance in human populations,
where it occurs with respect to intelligence and other mental
characters. These however are not single gene differences such as can be
discussed in the present context. The consequences of assortative
mating with a single locus can be deduced from Table 1.1 by
appropriate modification of the frequencies of the types of mating to allow
for the increased frequency of matings between like genotypes. The
effect on the genotype frequencies among the progeny is to increase
the frequencies of homozygotes and reduce that of heterozygotes.
In effect the population becomes partially subdivided into two
groups, mating taking place more frequently within than between
the groups.


CHAPTER   2 

CHANGES   OF   GENE   FREQUENCY 

We  have  seen  that  a  large  random-mating  population  is  stable  with 
respect  to  gene  frequencies  and  genotype  frequencies,  in  the  absence 
of  agencies  tending  to  change  its  genetic  properties.  We  can  now 
proceed  to  a  study  of  the  agencies  through  which  changes  of  gene 
frequency,  and  consequently  of  genotype  frequencies,  are  brought 
about.  There  are  two  sorts  of  process:  systematic  processes,  which 
tend  to  change  the  gene  frequency  in  a  manner  predictable  both  in 
amount  and  in  direction;  and  the  dispersive  process,  which  arises  in 
small  populations  from  the  effects  of  sampling,  and  is  predictable  in 
amount  but  not  in  direction.  In  this  chapter  we  are  concerned  only 
with  the  systematic  processes,  and  we  shall  consider  only  large  random- 
mating  populations  in  order  to  exclude  the  dispersive  process  from 
the  picture.  There  are  three  systematic  processes:  migration,  mutation, 
and  selection.  We  shall  study  these  separately  at  first,  assuming  that 
only  one  process  is  operating  at  a  time,  and  then  we  shall  see  how  the 
different  processes  interact. 


Migration 

The  effect  of  migration  is  very  simply  dealt  with  and  need  not  con- 
cern us  much  here,  though  we  shall  have  more  to  say  about  it  later, 
in connexion with small populations. Let us suppose that a large
population consists of a proportion, $m$, of new immigrants in each
generation, the remainder, $1 - m$, being natives. Let the frequency
of a certain gene be $q_m$ among the immigrants and $q_0$ among the
natives. Then the frequency of the gene in the mixed population, $q_1$,
will be

\[
q_1 = m q_m + (1 - m) q_0 \qquad (2.1)
\]

The change of gene frequency, $\Delta q$, brought about by one generation
of immigration is the difference between the frequency before
immigration and the frequency after immigration. Therefore

\[
\Delta q = q_1 - q_0 = m(q_m - q_0) \qquad (2.2)
\]

Thus  the  rate  of  change  of  gene  frequency  in  a  population  subject  to 
immigration  depends,  as  must  be  obvious,  on  the  immigration  rate 
and  on  the  difference  of  gene  frequency  between  immigrants  and 
natives. 
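
Equations 2.1 and 2.2 can be illustrated with a minimal Python sketch. The numerical values (native frequency 0.2, immigrant frequency 0.6, immigration rate 0.1) are arbitrary assumptions chosen only for illustration, as are the function names.

```python
def frequency_after_migration(q0, qm, m):
    """Gene frequency in the mixed population (equation 2.1)."""
    return m * qm + (1 - m) * q0


def change_from_migration(q0, qm, m):
    """Change of gene frequency in one generation (equation 2.2)."""
    return m * (qm - q0)


q0, qm, m = 0.2, 0.6, 0.1   # hypothetical values, for illustration only
q = q0
for generation in range(1, 6):
    dq = change_from_migration(q, qm, m)
    q = frequency_after_migration(q, qm, m)
    print(f"generation {generation}: q = {q:.4f}, change = {dq:+.4f}")
```

Repeating the step shows the change shrinking as the native frequency moves toward that of the immigrants, since the change depends on the remaining difference between the two.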


Mutation 

The  effect  of  mutation  on  the  genetic  properties  of  the  population 
differs  according  to  whether  we  are  concerned  with  a  mutational 
event  so  rare  as  to  be  virtually  unique,  or  with  a  mutational  step  that 
recurs  repeatedly.  The  first  produces  no  permanent  change,  whereas 
the  second  does. 

Non-recurrent mutation. Consider first a mutational event
that gives rise to just one representative of the mutated gene or
chromosome in the whole population. This sort of mutation is of
little importance as a cause of change of gene frequency, because the
product of a unique mutation has an infinitely small chance of
surviving in a large population, unless it has a selective advantage. This
can be seen from the following consideration. As a result of the single
mutation there will be one A1A2 individual in a population all the
rest of which is A1A1. The frequency of the mutated gene, A2, is
therefore  extremely  low.  Now  according  to  the  Hardy- Weinberg 
equilibrium  the  gene  frequency  should  not  change  in  subsequent 
generations.  But  with  this  situation  we  can  no  longer  ignore  the 
variation  of  gene  frequency  due  to  sampling.  With  a  gene  at  very  low 
frequency  the  sampling  variation,  even  though  very  small,  may  take 
the  frequency  to  zero,  and  the  gene  will  then  be  lost  from  the  popu- 
lation. Though  at  each  generation  a  single  gene  has  an  equal  chance 
of  surviving  or  being  lost,  the  loss  is  permanent  and  the  probability 
of  the  gene  still  being  present  decreases  with  the  passage  of  genera- 
tions (see  Li,  1955a).  The  conclusion,  therefore,  is  that  a  unique 
mutation  without  selective  advantage  cannot  produce  a  permanent 
change  in  the  population. 

Recurrent mutation. It is with the second type of mutation,
recurrent mutation, that we are concerned as an agent for causing
change  of  gene  frequency.  Each  mutational  event  recurs  regularly 
with  characteristic  frequency,  and  in  a  large  population  the  frequency 
of  a  mutant  gene  is  never  so  low  that  complete  loss  can  occur  from 
sampling.  We  have,  then,  to  find  out  what  is  the  effect  of  this  "pres- 
sure" of  mutation  on  the  gene  frequency  in  the  population. 

Suppose gene A1 mutates to A2 with a frequency $u$ per generation.
($u$ is the proportion of all A1 genes that mutate to A2 between one
generation and the next.) If the frequency of A1 in one generation is
$p_0$ the frequency of newly mutated A2 genes in the next generation is
$up_0$. So the new gene frequency of A1 is $p_0 - up_0$, and the change of
gene frequency is $-up_0$. Now consider what happens when the genes
mutate in both directions. Suppose for simplicity that there are only
two alleles, A1 and A2, with initial frequencies $p_0$ and $q_0$. A1 mutates
to A2 at a rate $u$ per generation, and A2 mutates to A1 at a rate $v$.
Then after one generation there is a gain of A2 genes equal to $up_0$ due
to mutation in one direction, and a loss equal to $vq_0$ due to mutation
in the other direction. Stated in symbols, we have the situation:

                                   u
Mutation rate                 A1  ⇌  A2
                                   v
Initial gene frequencies      $p_0$    $q_0$

Then the change of gene frequency in one generation is

\[
\Delta q = up_0 - vq_0
\]

It  is  easy  to  see  that  this  situation  leads  to  an  equilibrium  in  gene 
frequency  at  which  no  further  change  takes  place,  because  if  the 
frequency  of  one  allele  increases  fewer  of  the  other  are  left  to  mutate 
in  that  direction  and  more  are  available  to  mutate  in  the  other  direc- 
tion. The  point  of  equilibrium  can  be  found  by  equating  the  change 
of  frequency,  Aq,  to  zero.  Thus  at  equilibrium 

\[
pu = qv, \qquad \text{or} \qquad \frac{p}{q} = \frac{v}{u} \qquad (2.3)
\]

and

\[
q = \frac{u}{u + v} \qquad (2.4)
\]


Three  conclusions  can  be  drawn  from  the  effect  of  mutation  on 
gene frequency. Measurements of mutation rates indicate values
ranging between about $10^{-4}$ and $10^{-8}$ per generation (one in ten
thousand and one in a hundred million gametes). With normal
mutation  rates,  therefore,  mutation  alone  can  produce  only  very  slow 
changes  of  gene  frequency;  on  an  evolutionary  time-scale  they  might 
be  important,  but  they  could  scarcely  be  detected  by  experiment 
unless  with  micro-organisms.  The  second  conclusion  concerns  the 
equilibrium  between  mutation  in  the  two  directions.  Studies  of 
reverse  mutation  (from  mutant  to  wild  type)  indicate  that  it  is  usually 
less  frequent  than  forward  mutation  (from  wild  type  to  mutant),  on 
the  whole  about  one  tenth  as  frequent  (Muller  and  Oster,  1957). 
The equilibrium gene frequencies for such loci, resulting from
mutation alone, would therefore be about 0.1 of the wild-type allele
and 0.9 of the mutant; in other words the "mutant" would be the
common  form  and  the  "wild  type"  the  rare  form.  Since  this  is  not 
the  situation  we  find  in  natural  populations  it  is  clear  that  the  fre- 
quencies of  such  genes  are  not  the  product  of  mutation  alone.  We 
shall  see  in  the  next  section  that  the  rarity  of  mutant  alleles  is  attribu- 
table to  selection.  The  third  conclusion  concerns  the  effects  of  an 
increase  of  mutation  rates  such  as  might  be  caused  by  an  increase  of 
the  level  of  ionising  radiation  to  which  the  population  is  subjected. 
Any  loci  at  which  the  gene  frequencies  are  in  equilibrium  from  the 
effects  of  mutation  alone  will  not  be  affected  by  a  change  of  mutation 
rate,  provided  the  change  affects  forward  and  reverse  mutation  pro- 
portionately. This  can  be  seen  from  consideration  of  the  equilibrium 
gene  frequencies  given  in  equation  2.4. 
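
The three conclusions can be illustrated numerically. In the Python sketch below the mutation rates ($u = 10^{-5}$, $v = 10^{-6}$, i.e. reverse mutation one tenth as frequent as forward mutation) are assumed values chosen only for illustration, and the function names are not from the text.

```python
def next_q(q, u, v):
    """Frequency of A2 after one generation of mutation in both directions:
    a gain of u*p from A1 and a loss of v*q to A1."""
    return q + u * (1 - q) - v * q


def equilibrium_q(u, v):
    """Equilibrium frequency of A2 under mutation alone (equation 2.4)."""
    return u / (u + v)


u, v = 1e-5, 1e-6          # hypothetical forward and reverse mutation rates

# First conclusion: change under mutation alone is very slow
q = 0.0
for _ in range(1000):
    q = next_q(q, u, v)
print(f"q after 1000 generations, starting from 0: {q:.5f}")

# Second conclusion: the equilibrium would make the "mutant" the common form
print(f"equilibrium q = {equilibrium_q(u, v):.3f}")

# Third conclusion: a proportionate rise in both rates leaves it unchanged
print(f"equilibrium q with both rates tenfold = {equilibrium_q(10 * u, 10 * v):.3f}")
```

A proportionate increase of both rates changes the speed of approach but not the point of equilibrium, which depends only on the ratio of $u$ to $v$.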


Selection 

Hitherto  we  have  supposed  that  all  individuals  in  the  population 
contribute  equally  to  the  next  generation.  Now  we  must  take  account 
of  the  fact  that  individuals  differ  in  viability  and  fertility,  and  that 
they  therefore  contribute  different  numbers  of  offspring  to  the  next 
generation.  The  proportionate  contribution  of  offspring  to  the  next 
generation  is  called  the  fitness  of  the  individual,  or  sometimes  the 
adaptive  value,  or  selective  value.  If  the  differences  of  fitness  are  in 
any  way  associated  with  the  presence  or  absence  of  a  particular  gene 
in  the  individual's  genotype,  then  selection  operates  on  that  gene. 
When  a  gene  is  subject  to  selection  its  frequency  in  the  offspring  is 
not  the  same  as  in  the  parents,  since  parents  of  different  genotypes 
pass  on  their  genes  unequally  to  the  next  generation.    In  this  way 
selection causes a change of gene frequency, and consequently also of
genotype  frequency.  The  change  of  gene  frequency  resulting  from 
selection  is  more  complicated  to  describe  than  that  resulting  from 
mutation,  because  the  differences  of  fitness  that  give  rise  to  the 
selection  are  an  aspect  of  the  phenotype.  We  therefore  have  to  take 
account  of  the  degree  of  dominance  shown  by  the  genes  in  question. 
Dominance,  in  this  connexion,  means  dominance  with  respect  to 
fitness,  and  this  is  not  necessarily  the  same  as  the  dominance  with 
respect to the main visible effects of the gene. Most mutant genes, for
example, are completely recessive to the wild type in their visible
effects, but this does not necessarily mean that the heterozygote has a
fitness equal to that of the wild-type homozygote. The meaning of
the different degrees of dominance with which we shall deal is
illustrated in Fig. 2.1.

Fig. 2.1. Degrees of dominance with respect to fitness.

It  is  most  convenient  to  think  of  selection  acting  against  the  gene 
in  question,  in  the  form  of  selective  elimination  of  one  or  other  of  the 
genotypes  that  carry  it.  This  may  operate  either  through  reduced 
viability  or  through  reduced  fertility  in  its  widest  sense,  including 
mating  ability.  In  either  case  the  outcome  is  the  same:  the  genotype 
selected  against  makes  a  smaller  contribution  of  gametes  to  form 
zygotes  in  the  next  generation.  We  may  therefore  treat  the  change  of 
gene  frequency  as  taking  place  between  the  counting  of  genotypes 
among  the  zygotes  of  the  parent  generation  and  the  formation  of 
zygotes in the offspring generation. The intensity of the selection is
expressed as the coefficient of selection, s, which is the proportionate
reduction in the gametic contribution of a particular genotype compared
with a standard genotype, usually the most favoured. The
contribution of the favoured genotype is taken to be 1, and the
contribution of the genotype selected against is then 1 - s. This
expresses the fitness of one genotype compared with the other. Suppose,
for example, that the coefficient of selection is s = 0.1; this
means that for every 100 zygotes produced by the favoured genotype,
only 90 are produced by the genotype selected against.

The  fitness  of  a  genotype  with  respect  to  any  particular  locus  is 
not necessarily the same in all individuals. It depends on the environmental
circumstances in which the individual lives, and also on
the  genotype  with  respect  to  genes  at  other  loci.  When  we  assign  a 
certain  fitness  to  a  genotype,  this  refers  to  the  average  fitness  in  the 
whole  population.  Though  differences  of  fitness  between  individuals 
result in selection being applied to many, perhaps to all, loci simultaneously,
we shall limit our attention here to the effects of selection
on  the  genes  at  a  single  locus,  supposing  that  the  average  fitness  of  the 
different  genotypes  remains  constant  despite  the  changes  resulting 
from  selection  applied  simultaneously  to  other  loci.  The  conclusions 
we  shall  reach  apply  equally  to  natural  selection  occurring  under 
natural  conditions  without  the  intervention  of  man,  and  to  artificial 
selection  imposed  by  the  breeder  or  experimenter  through  his  choice 
of  individuals  as  parents  and  through  the  number  of  offspring  he 
chooses  to  rear  from  each  parent. 


Change  of  gene  frequency  under  selection.  We  have  first  to 
derive  the  basic  formulae  for  the  change  of  gene  frequency  brought 
about  by  one  generation  of  selection.  Then  we  can  consider  what  they 
tell  us  about  the  effectiveness  of  selection.  The  different  conditions 
of  dominance  have  to  be  taken  account  of,  but  the  method  is  the  same 
for  all,  and  we  shall  illustrate  it  by  reference  to  the  case  of  complete 
dominance  with  selection  acting  against  the  recessive  homozygote. 
Let the genes $A_1$ and $A_2$ have initial frequencies p and q, $A_1$ being
completely dominant to $A_2$, and let the coefficient of selection against
$A_2A_2$ individuals be s. Multiplying the initial frequency by the fitness
of  each  genotype  we  obtain  the  proportionate  contribution  of  each 
genotype  to  the  gametes  that  will  form  the  next  generation,  thus: 

Genotypes               $A_1A_1$    $A_1A_2$    $A_2A_2$        Total
Initial frequencies     $p^2$       $2pq$       $q^2$           1
Fitness                 1           1           $1 - s$
Gametic contribution    $p^2$       $2pq$       $q^2(1 - s)$    $1 - sq^2$


Note that the total gametic contribution is no longer unity, because
there has been a proportionate loss of $sq^2$ due to the selection. To
find the frequency of $A_2$ gametes produced, and so the frequency of
$A_2$ genes in the progeny, we take the gametic contribution of $A_2A_2$
individuals plus half that of $A_1A_2$ individuals and divide by the new
total, i.e. we apply equation 1.1. Thus the new gene frequency is


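$$q_1 = \frac{q^2(1 - s) + pq}{1 - sq^2} \qquad (2.5)$$

The change of gene frequency, $\Delta q$, resulting from one generation of
selection is

$$\Delta q = q_1 - q = \frac{q^2(1 - s) + pq}{1 - sq^2} - q$$

which on simplification reduces to

$$\Delta q = -\frac{sq^2(1 - q)}{1 - sq^2} \qquad (2.6)$$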


From this we see that the effect of selection on gene frequency depends
not only on the intensity of selection, s, but also on the initial
gene  frequency.  But  both  relationships  are  somewhat  complex,  and 
the  examination  of  their  significance  will  be  postponed  till  after  the 
other  situations  have  been  dealt  with. 
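
The simplification leading from equation 2.5 to equation 2.6 can be checked numerically. The following is a minimal Python sketch; the function names and the trial values q = 0.5 and s = 0.1 are illustrative assumptions and do not come from the text.

```python
def next_q_recessive(q, s):
    """New frequency of A2 after one generation (equation 2.5)."""
    p = 1.0 - q
    # Gametic contribution of A2A2 plus half that of A1A2, over the new total.
    return (q * q * (1.0 - s) + p * q) / (1.0 - s * q * q)


def delta_q_recessive(q, s):
    """Change of gene frequency in one generation (equation 2.6)."""
    return -s * q * q * (1.0 - q) / (1.0 - s * q * q)


q, s = 0.5, 0.1  # illustrative values only
print("q1 - q       =", next_q_recessive(q, s) - q)
print("equation 2.6 =", delta_q_recessive(q, s))
```

The two printed values should agree to within rounding error, whatever values of q and s between 0 and 1 are tried.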

Selection may act against the dominant phenotype and favour the
recessive: we then put 1 - s for the fitness of $A_1A_1$ and of $A_1A_2$
genotypes. The expression for $\Delta q$ is given in Table 2.1. The difference
may best be appreciated by considering the effects of total elimination
(s = 1). The expression for selection against the dominant allele then
reduces to $\Delta q = 1 - q$, which expresses the fact that if only the recessive
genotype survives to breed the frequency of the recessive allele
will become 1 after a single generation of selection. But, on the other
hand, if there is complete elimination of the recessive genotype the
frequency of the dominant allele does not reach 1 after a single
generation. The difference between the effects of selection in opposite
directions becomes less marked as the value of s decreases.

If there is incomplete dominance the expression for $\Delta q$ is again
different. The case of exact intermediate dominance is given in
Table 2.1. Here we put $1 - \tfrac{1}{2}s$ for the fitness of $A_1A_2$, and $1 - s$ for
the fitness of the $A_2A_2$ genotype. For selection in the opposite direction
in this case we need only interchange the initial frequencies of the
two alleles, writing p in the place of q.

Table 2.1

Change of gene frequency, $\Delta q$, after one generation of selection
under different conditions of dominance specified in Fig. 2.1.

Conditions of dominance     Initial frequencies and fitness          Change of frequency,
and selection               of the genotypes                         $\Delta q$, of gene $A_2$

                            $A_1A_1$     $A_1A_2$              $A_2A_2$
                            $p^2$        $2pq$                 $q^2$

No dominance,               1            $1 - \tfrac{1}{2}s$   $1 - s$      $-\dfrac{\tfrac{1}{2}sq(1 - q)}{1 - sq}$             (1)
selection against $A_2$

Complete dominance,         1            1                     $1 - s$      $-\dfrac{sq^2(1 - q)}{1 - sq^2}$                     (2)
selection against $A_2A_2$

Complete dominance,         $1 - s$      $1 - s$               1            $+\dfrac{sq^2(1 - q)}{1 - s(1 - q^2)}$               (3)
selection against $A_1-$

Overdominance,              $1 - s_1$    1                     $1 - s_2$    $+\dfrac{pq(s_1 p - s_2 q)}{1 - s_1 p^2 - s_2 q^2}$  (4)
selection against
$A_1A_1$ and $A_2A_2$

When s is small the denominators differ little from 1, and the numerators
alone can be taken to represent $\Delta q$ sufficiently accurately for most purposes.
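
The method described above (multiply each genotype frequency by its fitness, then take the share of A2 gametes in the new total) applies to every row of Table 2.1. The Python sketch below is one way of checking the four closed-form expressions against that direct calculation. The function names and the numerical values of q, s, s1 and s2 are assumptions chosen only for illustration.

```python
def delta_q(q, w11, w12, w22):
    """Exact change in the frequency q of A2 after one generation.

    w11, w12, w22 are the fitnesses of A1A1, A1A2 and A2A2.
    """
    p = 1.0 - q
    total = p * p * w11 + 2.0 * p * q * w12 + q * q * w22
    q_next = (q * q * w22 + p * q * w12) / total
    return q_next - q


def table_2_1(q, s, s1, s2):
    """Return {case: (direct calculation, closed form from Table 2.1)}."""
    p = 1.0 - q
    return {
        "(1) no dominance": (
            delta_q(q, 1.0, 1.0 - 0.5 * s, 1.0 - s),
            -0.5 * s * q * (1.0 - q) / (1.0 - s * q),
        ),
        "(2) complete dominance, against A2A2": (
            delta_q(q, 1.0, 1.0, 1.0 - s),
            -s * q * q * (1.0 - q) / (1.0 - s * q * q),
        ),
        "(3) complete dominance, against A1-": (
            delta_q(q, 1.0 - s, 1.0 - s, 1.0),
            s * q * q * (1.0 - q) / (1.0 - s * (1.0 - q * q)),
        ),
        "(4) overdominance": (
            delta_q(q, 1.0 - s1, 1.0, 1.0 - s2),
            p * q * (s1 * p - s2 * q) / (1.0 - s1 * p * p - s2 * q * q),
        ),
    }


for case, (direct, formula) in table_2_1(q=0.3, s=0.2, s1=0.1, s2=0.3).items():
    print(f"{case}: direct = {direct:+.6f}, formula = {formula:+.6f}")
```

Calling `delta_q(q, 0.0, 0.0, 1.0)` reproduces the case of total elimination of the dominant phenotype, for which the change is 1 - q.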

Finally, selection may favour the heterozygote, a condition known
as overdominance. In this case we put $1 - s_1$ and $1 - s_2$ for the fitness
of the two homozygotes. The expression for $\Delta q$ is given in Table 2.1.
This special case will be given more detailed attention later. The
different conditions of dominance to which the expressions in
Table 2.1 refer are illustrated diagrammatically in Fig. 2.1. Let us
now see what these equations tell us about the effectiveness of selection.

Effectiveness of selection. We see from the formulae that the
effectiveness of selection, i.e. the magnitude of $\Delta q$, depends on the
initial gene frequency, q. The nature of this relationship is best
appreciated from graphs showing $\Delta q$ at different values of q. Fig. 2.2
shows these graphs for the cases of no dominance and complete
dominance. They also distinguish between selection in the two
directions. A value of s = 0.2 was chosen for the coefficient of selection
because, for reasons given in Chapter 12, this seems to be the
right order of magnitude for the coefficient of selection operating on
genes concerned with metric characters in laboratory selection experiments.
First we may note that with this value of s there is never a
great difference in $\Delta q$ according to the direction of selection. The
two important points about the effectiveness of selection that these
graphs demonstrate are: (i) Selection is most effective at intermediate
gene frequencies and becomes least effective when q is either large or
small. (ii) Selection for or against a recessive gene is extremely
ineffective when the recessive allele is rare. This is the consequence
of the fact, noted earlier, that when a gene is rare it is represented
almost entirely in heterozygotes.

Fig. 2.2. Change of gene frequency, $\Delta q$, under selection of intensity s = 0.2, at
different values of initial gene frequency, q. Upper figure: a gene with no dominance.
Lower figure: a gene with complete dominance. The graphs marked
(-) refer to selection against the gene whose frequency is q, so that $\Delta q$ is negative.
The graphs marked (+) refer to selection in favour of the gene, so that
$\Delta q$ is positive. (From Falconer, 1954a; reproduced by courtesy of the editor of
the International Union of Biological Sciences.)

Another  way  of  looking  at  the  effect  of  the  initial  gene  frequency  on 
the  effectiveness  of  selection  is  to  plot  a  graph  showing  the  course  of 
selection  over  a  number  of  generations,  starting  from  one  or  other 
extreme.  Such  graphs  are  shown  in  Fig.  2.3.  They  were  constructed 
directly from those of Fig. 2.2, and refer again to a coefficient of
selection, s = 0.2. They show that the change due to selection is at
first  very  slow,  whether  one  starts  from  a  high  or  a  low  initial  gene 
frequency;  it  becomes  more  rapid  at  intermediate  frequencies  and 
falls  off  again  at  the  end.  In  the  case  of  a  fully  dominant  gene  one  is 
chiefly  interested  in  the  frequency  of  the  homozygous  recessive 
genotype, i.e. $q^2$. For this reason the graph shows the effect of selection
on $q^2$ instead of on q.
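
A course of selection of this kind can also be generated by applying the one-generation calculation repeatedly. The following Python sketch is an illustration under stated assumptions: the starting frequency of 0.99, the run of 100 generations and the function name are choices made here, while s = 0.2 follows the text.

```python
def course_of_selection(q0, w11, w12, w22, generations):
    """Frequencies of A2 over successive generations of constant selection."""
    freqs = [q0]
    q = q0
    for _ in range(generations):
        p = 1.0 - q
        total = p * p * w11 + 2.0 * p * q * w12 + q * q * w22
        q = (q * q * w22 + p * q * w12) / total
        freqs.append(q)
    return freqs


s = 0.2
# Selection against A2 with no dominance, and against a fully recessive A2.
no_dominance = course_of_selection(0.99, 1.0, 1.0 - 0.5 * s, 1.0 - s, 100)
recessive = course_of_selection(0.99, 1.0, 1.0, 1.0 - s, 100)

# Columns: generation, q with no dominance, q^2 with complete dominance.
for t in range(0, 101, 20):
    print(t, round(no_dominance[t], 4), round(recessive[t] ** 2, 4))
```

For the fully dominant case the sketch prints the square of q, the frequency of the recessive homozygote, in line with the graph described above.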

It is often useful to express the change of gene frequency, $\Delta q$,
under selection in a simplified form, which is a sufficiently good
approximation for many purposes. If either the coefficient of selection,
s, or the gene frequency, q, is small, then the denominators of
the equations in Table 2.1 become very nearly unity, and we can use
the numerators alone as expressions for $\Delta q$. Then for selection in
either direction we have, with no dominance:

$$\Delta q = \pm\tfrac{1}{2}sq(1 - q) \qquad \text{(approx.)} \qquad (2.7)$$

and with complete dominance:

$$\Delta q = \pm sq^2(1 - q) \qquad \text{(approx.)} \qquad (2.8)$$
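
How much is lost by dropping the denominators can be seen by computing both forms side by side. This is a small Python sketch for selection against the gene; the function name and the pairs of q and s tried are assumptions made here for illustration.

```python
def exact_and_approx(q, s):
    """Exact (Table 2.1) and approximate (2.7, 2.8) changes, selection against A2."""
    no_dom_approx = -0.5 * s * q * (1.0 - q)           # equation 2.7
    no_dom_exact = no_dom_approx / (1.0 - s * q)       # Table 2.1, row (1)
    recessive_approx = -s * q * q * (1.0 - q)          # equation 2.8
    recessive_exact = recessive_approx / (1.0 - s * q * q)  # Table 2.1, row (2)
    return (no_dom_exact, no_dom_approx), (recessive_exact, recessive_approx)


for q, s in [(0.5, 0.2), (0.5, 0.01), (0.01, 0.2)]:
    (nd_exact, nd_approx), (r_exact, r_approx) = exact_and_approx(q, s)
    print(f"q={q}, s={s}: no dominance {nd_exact:.3e} vs {nd_approx:.3e}; "
          f"recessive {r_exact:.3e} vs {r_approx:.3e}")
```

The ratio of exact to approximate value is 1/(1 - sq) with no dominance and 1/(1 - sq^2) with complete dominance, so the two forms converge as either s or q becomes small.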

Fig. 2.3. Change of gene frequency during the course of selection from one
extreme to the other. Intensity of selection, s = 0.2. Upper figure: a gene with
no dominance. Lower figure: a gene with complete dominance, q being the
frequency of the recessive allele and $q^2$ that of the recessive homozygote. The
graphs marked (-) refer to selection against the gene whose frequency is q, so
that q or $q^2$ decreases. The graphs marked (+) refer to selection in favour of the
gene, so that q or $q^2$ increases. (From Falconer, 1954a; reproduced by courtesy
of the editor of the International Union of Biological Sciences.)

Example 2.1. As an example of the change of gene frequency under
selection we shall take the case of a sex-linked gene, in spite of the added
complication, because there is no well documented case of an autosomal
gene. Fig. 2.4 shows the change of the frequency of the recessive sex-linked
gene "raspberry" in Drosophila melanogaster over a period of about
eighteen generations, described by Merrell (1953). The population was
started with a gene frequency of 0.5 in both sexes, and was therefore in
equilibrium at the beginning (see p. 17). Counts were made at about
monthly intervals, and the gene frequency in both sexes combined (by
equation 1.3) is shown against the scale of days in the figure. Measurements
of fitness were made by comparison of the relative viability of
mutant and wild-type phenotypes, and of their relative success in mating.
No differences of viability were detected, nor of the success of females in
mating. But mutant males were only 50 per cent as successful as wild-type
males in mating. The changes of gene frequency expected on the
basis of this difference of fitness were then calculated generation by
generation, and these calculated values are shown in the figure by the
smooth curve, plotted against the scale of generations. From a similar
experiment with a different mutant it was found that the calculated and
observed curves coincided if a period of 24 days was taken as the interval
between generations. For this reason 24 days to a generation was taken as
the basis for superimposing the curves shown here. Since the calculated
curve was to this extent made to fit the observed, the good agreement
between the two cannot be taken as proof that selection operated only
through the males' success in mating. But the similarity in their shapes
illustrates well how the change of gene frequency is rapid at first, tails off
as the gene frequency becomes lower, and becomes very slow when it
approaches zero.

Fig. 2.4. Change of gene frequency under natural selection in
the laboratory, as described in Example 2.1. Horizontal scales:
generations and days. (Data from Merrell, 1953.)


Number of generations required. How many generations of
selection would be needed to effect a specified change of gene frequency?
An answer to this question is sometimes required in connexion
with breeding programmes or proposed eugenic measures.
We shall here consider only the case of selection against a recessive
when elimination of the unwanted homozygote is complete, i.e. s = 1.
This would apply to natural selection against a recessive lethal, and
artificial selection against an unwanted recessive in a breeding programme.
We shall also, for the moment, suppose that there is no
mutation. We had in equation 2.5 an expression for the new gene
frequency after one generation of selection against a recessive.
Substituting s = 1 in this equation and writing $q_0, q_1, q_2, \ldots, q_t$ for the
gene frequency after 0, 1, 2, ..., t generations of selection we have


$$q_1 = \frac{q_0}{1 + q_0}$$

and

$$q_2 = \frac{q_1}{1 + q_1} = \frac{q_0}{1 + 2q_0}$$

by substituting for $q_1$ and simplifying. So in general

$$q_t = \frac{q_0}{1 + tq_0} \qquad (2.9)$$

and the number of generations, t, required to change the gene
frequency from $q_0$ to $q_t$ is

$$t = \frac{q_0 - q_t}{q_0 q_t} = \frac{1}{q_t} - \frac{1}{q_0} \qquad (2.10)$$

We  may  use  this  formula  to  illustrate  the  point  already  made,  that 
when  the  frequency  of  a  recessive  gene  is  low  selection  is  very  slow 
to  change  it. 
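
Equation 2.10 can be checked against repeated application of the one-generation step with s = 1. The sketch below is an illustration only; the function names and the starting frequency of 0.5 are assumptions introduced here.

```python
def generations_required(q0, qt):
    """Equation 2.10, for complete elimination of the recessive homozygote (s = 1)."""
    return 1.0 / qt - 1.0 / q0


def iterate_s_equals_1(q0, t):
    """Apply q -> q / (1 + q) for t generations (equation 2.5 with s = 1)."""
    q = q0
    for _ in range(t):
        q = q / (1.0 + q)
    return q


q0 = 0.5
q10 = iterate_s_equals_1(q0, 10)
print("q after 10 generations:", q10)
print("generations from equation 2.10:", generations_required(q0, q10))
```

Because t depends only on the difference of the reciprocals, halving a frequency that is already low costs many more generations than halving a high one.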

Example  2.2.  It  is  sometimes  suggested,  as  a  eugenic  measure,  that 
those  suffering  from  serious  inherited  defects  should  be  prevented  from 
reproducing,  since  in  this  way  the  frequency  of  such  defects  would  be 
reduced  in  future  generations.  Before  deciding  whether  the  proposal  is  a 
good  one  we  ought  to  know  what  it  would  be  expected  to  achieve.  We 
cannot properly discuss this problem without taking mutation into account,
as we shall do later; the answer we get ignoring mutation, as we do
now, shows what is the best that could be hoped for. Let us take albinism
as an example, though it cannot be regarded as a very serious defect, and
ask the question: how long would it take to reduce its frequency to half the
present value? The present frequency is about 1/20,000, and this makes
$q_0 = 1/141$, as we saw in Example 1.4. The objective is $q_t^2 = 1/40,000$, which
makes $q_t = 1/200$. So, from equation 2.10, t = 200 - 141 = 59 generations.
With  25  years  to  a  generation  it  would  take  nearly  1500  years  to  achieve 
this  modest  objective.  More  serious  recessive  defects  are  generally  even 
less  common  than  albinism  and  with  them  elimination  would  be  still 
slower. 
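
The arithmetic of Example 2.2 can be reproduced in a few lines. This Python sketch uses only the figures given in the example (an incidence of 1/20,000, a target of half that, and 25 years to a generation); the variable names are assumptions made here.

```python
import math

present_incidence = 1.0 / 20000.0   # frequency of albinos, from the example
target_incidence = present_incidence / 2.0

q0 = math.sqrt(present_incidence)   # about 1/141
qt = math.sqrt(target_incidence)    # 1/200
t = 1.0 / qt - 1.0 / q0             # equation 2.10 (s = 1, no mutation)
years = t * 25.0                    # 25 years to a generation, as in the example

print(f"1/q0 = {1.0 / q0:.1f}, 1/qt = {1.0 / qt:.1f}")
print(f"t = {t:.1f} generations, about {years:.0f} years")
```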


Balance  between  mutation  and  selection.  Having  described 
the  effects  of  mutation  and  selection  separately  we  must  now  compare 
them  and  consider  them  jointly.  Which  is  the  more  effective  process 
in  causing  change  of  gene  frequency?  Is  it  reasonable  to  attribute  the 
low frequency of deleterious genes that we find in natural populations
to the balance between mutation tending to increase the frequency
and selection tending to decrease it? The expressions already
obtained for the change of gene frequency under mutation or selection
alone show that both depend on the initial gene frequency, but in
different ways. Mutation to a particular gene is most effective in
increasing its frequency when the mutant gene is rare (because there
are more of the unmutated genes to mutate); but selection is least
effective when the gene is rare. The relative effectiveness of the two
processes depends therefore on the gene frequency, and if both processes
operate for long enough a state of equilibrium will eventually
be  reached.  So  we  must  find  what  the  gene  frequency  will  be  when 
equilibrium  is  reached.  This  is  done  by  equating  the  two  expressions 
for  the  change  of  gene  frequency,  because  at  equilibrium  the  change 
due  to  mutation  will  be  equal  and  opposite  to  the  change  due  to 
selection. 

Let us consider first a fully recessive gene with frequency q,
mutation rate to it u, and from it v; and selection coefficient against it
s. Then from equations (2.3) and (2.6) we have at equilibrium

$$u(1 - q) - vq = \frac{sq^2(1 - q)}{1 - sq^2} \qquad (2.11)$$


This equation is too complicated to give a clear answer to our question.
But we can make two simplifications with only a trivial sacrifice
of accuracy. We are specifically interested in genes at low equilibrium
frequencies. If q is small the term vq representing back mutation is
relatively unimportant and can be neglected; and we can use the
approximate expression (equation 2.8) for the selection effect.
Making these simplifications we have the equilibrium condition for
selection against a recessive gene

$$u(1 - q) = sq^2(1 - q) \qquad \text{(approx.)}$$

$$u = sq^2 \qquad \text{(approx.)} \qquad (2.12)$$

$$q = \sqrt{\frac{u}{s}} \qquad \text{(approx.)} \qquad (2.13)$$
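
To see how little accuracy is sacrificed, the full equilibrium condition (2.11) can be solved numerically and compared with equation 2.13. The Python sketch below uses bisection, which is a technique chosen here and not one used in the text. The values of u, v and s are illustrative assumptions, and the search assumes that mutation outweighs selection at q = 0 and the reverse at q = 0.5, which holds for these values.

```python
import math


def net_change(q, u, v, s):
    """Mutation gain minus selection loss for a recessive gene (both sides of 2.11)."""
    mutation = u * (1.0 - q) - v * q
    selection = s * q * q * (1.0 - q) / (1.0 - s * q * q)
    return mutation - selection


def exact_equilibrium(u, v, s, lo=0.0, hi=0.5, steps=200):
    """Bisection for the root of net_change between lo and hi."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if net_change(mid, u, v, s) > 0.0:
            lo = mid   # mutation still outweighs selection: equilibrium lies higher
        else:
            hi = mid
    return 0.5 * (lo + hi)


u, v, s = 1e-5, 1e-6, 0.1   # illustrative values; v in particular is an assumption
print("exact root of 2.11 :", exact_equilibrium(u, v, s))
print("approximation 2.13 :", math.sqrt(u / s))
```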


For  a  gene  with  no  dominance  similar  reasoning  from  equation  (1) 
in  Table  2.1  gives  the  equilibrium  condition 


$$q = \frac{2u}{s} \qquad \text{(approx.)} \qquad (2.14)$$


Finally, consider selection against a completely dominant gene, the
frequency of the dominant gene being 1 - q, and the mutation rate
to it being v. In this case 1 - q is very small and the term u(1 - q) in
equation 2.11 is negligible. We have therefore at equilibrium

$$vq = sq^2(1 - q) \qquad \text{(approx.)}$$

$$q(1 - q) = \frac{v}{s} \qquad \text{(approx.)}$$

or

$$H = \frac{2v}{s} \qquad \text{(approx.)} \qquad (2.15)$$

where  H  is  the  frequency  of  heterozygotes.  If  the  mutant  gene  is 
rare  H  is  very  nearly  the  frequency  of  the  mutant  phenotype  in  the 
population. 

Example  2.3.  If  the  equilibrium  state  is  accepted  as  applicable,  we 
can  use  it  to  get  an  estimate  of  the  mutation  rate  of  dominant  abnormalities 
for  which  the  coefficient  of  selection  is  known.  Among  some  human 
examples  described  by  Haldane  (1949)  is  the  case  of  dominant  dwarfism 
(chondrodystrophy)  studied  in  Denmark.  The  frequency  of  dwarfs  was 
estimated at $10.7 \times 10^{-5}$, and their fitness (1 - s) at 0.196. The estimate of
fitness was made from the number of children produced by dwarfs compared
with their normal sibs. The mutation rate, by equation (2.15),
comes out at $4.3 \times 10^{-5}$. Though there is a possibility of serious error in
the estimate of frequency owing to prenatal mortality of dwarfs, the
mutation rate is almost certainly estimated within the right order of magnitude.
For a discussion of the estimation of mutation rates in man see
Crow  (1956). 
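
The estimate in Example 2.3 follows from rearranging equation 2.15. This is a short Python sketch of the calculation using the two figures quoted in the example; the variable names are assumptions made here.

```python
heterozygote_frequency = 10.7e-5   # frequency of dwarfs, taken as H
fitness = 0.196                    # relative fitness of dwarfs, 1 - s
s = 1.0 - fitness

# Equation 2.15 rearranged: H = 2v / s, so v = s * H / 2.
v = 0.5 * s * heterozygote_frequency
print(f"s = {s:.3f}, estimated mutation rate v = {v:.2e}")
```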

These expressions for the equilibrium gene frequency under the
joint action of mutation and selection show that the gene frequency
can have any value at equilibrium, depending on the relative magnitude
of the mutation rate and the coefficient of selection. But if
mutation rates are of the order of magnitude commonly accepted,
i.e. $10^{-5}$, or thereabouts, then only a mild selection against the mutant
gene will be needed to hold it at a very low equilibrium frequency.
For example, the following are the equilibrium frequencies of a
recessive gene and of the recessive homozygote under various intensities
of selection if the mutation rate is $10^{-5}$:

s      =    0.001     0.01      0.1       0.5
q      =    0.1       0.03      0.01      0.0045
$q^2$  =    0.01      0.001     0.0001    $2 \times 10^{-5}$


Thus, if a gene mutates at the rate of $10^{-5}$, a selective disadvantage of
10 per cent is enough to hold the frequency of the recessive homozygote
at one in ten thousand; and a 50 per cent disadvantage will
hold it at one in fifty thousand. It is quite clear therefore that the
low  frequency  of  deleterious  mutants  in  natural  populations  is  in 
accord  with  what  would  be  expected  from  the  joint  action  of  mutation 
and  selection.  A  further  conclusion  is  that  mutation  alone  is  most 
unlikely  to  be  a  cause  of  evolutionary  change.  It  is  not  mutation,  but 
selection,  that  chiefly  determines  whether  a  gene  spreads  through  the 
population  or  remains  a  rare  abnormality,  unless  the  mutation  rate 
is  very  much  higher  than  seems  to  be  the  rule. 
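
The equilibrium frequencies tabulated above follow from equation 2.13 with a mutation rate of $10^{-5}$. The Python sketch below recomputes them; the variable names and the layout of the output are assumptions made here.

```python
import math

u = 1e-5  # mutation rate assumed in the text
rows = []
for s in (0.001, 0.01, 0.1, 0.5):
    q = math.sqrt(u / s)        # equation 2.13
    rows.append((s, q, q * q))  # q*q is the frequency of the recessive homozygote

for s, q, q2 in rows:
    print(f"s = {s:<6} q = {q:.4f}   q^2 = {q2:.6f}   (one in {1.0 / q2:,.0f})")
```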

Let  us  now  briefly  consider  two  questions  of  social  importance 
concerning  the  balance  between  selection  and  mutation:  the  effect  of 
an  increase  of  mutation  rate,  and  the  effect  of  a  change  in  the  intensity 
of  selection  against  deleterious  mutants.  These  questions  are  more 
fully  discussed  by  Crow  (1957). 

Increase  of  mutation  rate.  Since  the  products  of  mutation  are 
predominantly  deleterious,  the  process  of  mutation  has  a  harmful 
effect  on  a  proportion  of  the  individuals  in  a  population.  When  an 
individual  dies  or  fails  to  reproduce  in  consequence  of  the  reduced 
fitness of its genotype, we may refer to this as a "genetic death." An
increase in the frequency of genetic deaths would reduce the potential
reproductive rate and might thus reduce the speed with which a
species could multiply in an unoccupied territory. But when the
numbers of adults are held constant by density-dependent factors,
even quite a high frequency of genetic deaths will not affect the
ability of the population to perpetuate itself, especially if the reproductive
rate is high, because the death of some individuals leaves room
for others that would otherwise have died from lack of food or some
other cause. There is a species of Drosophila, for example (D.
tropicalis, from Central America), in which 50 per cent of individuals
in a certain locality suffer genetic death, and yet the population
flourishes (Dobzhansky and Pavlovsky, 1955). In species with low
reproductive rates the frequency of genetic deaths is of greater consequence,
particularly in ourselves, where the death of every individual
is a matter of concern. Let us therefore consider what effect is to be
expected from an increase of mutation rate such as might be caused
by an increase in the amount of ionising radiation to which human
populations are exposed.

Let us take the case of a recessive gene with a mutation rate (to it)
of u, the gene being in equilibrium at a frequency of q. Then, if the
coefficient of selection against the homozygote is s, the frequency of
genetic deaths is $sq^2$. This is the proportionate loss due to selection,
as shown on p. 29, and it is equal to u, by equation 2.12. Thus the
frequency of genetic deaths, when equilibrium has been attained,
depends on the mutation rate alone, and is not influenced by the
degree of harmfulness of the gene. The reason for this apparent paradox
is that the more harmful genes come to equilibrium at lower
frequencies.

Now, if the mutation rate is increased, and maintained at the new
level, the gene will begin to increase toward a new point of equilibrium
at which $sq^2$ will be equal to the new mutation rate. Thus if
the mutation rate were doubled the frequency of genetic deaths would
also be doubled, when the new equilibrium had been reached. But
the approach to the new equilibrium would be very slow. The change
of gene frequency in the first generation is approximately

$$\Delta q = u(1 - q) - sq^2(1 - q)$$

u being the new mutation rate (from equations 2.3 and 2.8, but
neglecting back mutation). To see what this means let us take a
mutation rate of $10^{-5}$ as being probably representative of many loci,
and let us suppose that this was doubled. We may with sufficient
accuracy take 1 - q as unity. Then, since $sq^2$ is still equal to the old
mutation rate of $10^{-5}$,

$$\Delta q = 2 \times 10^{-5} - 10^{-5} = 10^{-5}$$

The immediate effect of the increase of mutation rate would therefore
be very small indeed.
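
The slowness of the approach to the new equilibrium can be illustrated by iterating the approximate expression for the change of gene frequency. In the Python sketch below the doubling of the mutation rate from $10^{-5}$ follows the text, but the selection coefficient s = 0.1 and the choice of "halfway to the new level of genetic deaths" as a stopping point are assumptions made purely for illustration.

```python
import math

s = 0.1                  # illustrative selection coefficient (an assumption)
u_old, u_new = 1e-5, 2e-5

q = math.sqrt(u_old / s)                 # old equilibrium, equation 2.13
first_change = u_new * (1.0 - q) - s * q * q * (1.0 - q)

# Count generations until genetic deaths, s*q^2, are halfway to the new level.
halfway = 0.5 * (u_old + u_new)
generations = 0
while s * q * q < halfway:
    q += u_new * (1.0 - q) - s * q * q * (1.0 - q)
    generations += 1

print(f"change in first generation: {first_change:.2e}")
print(f"generations to reach halfway: {generations}")
```

Under these assumed values the frequency of genetic deaths takes some hundreds of generations to cover even half the distance to its new equilibrium.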

Change  of  selection  intensity.  Intensification  of  selection  is 
sometimes  advocated  as  a  eugenic  measure  in  human  populations, 
on  the  grounds  that  if  sufferers  from  genetic  defects  were  prevented 
from  breeding  the  frequency  of  the  defects  would  be  reduced.  We 
saw from Example 2.2 that the effect of selection against a recessive
defect is very slow indeed, even when mutation is ignored. The true
situation is even worse. We cannot reduce the frequency of an
abnormality, whether dominant or recessive, below the new equilibrium
frequency. The serious defects have already a fairly strong
natural  selection  working  on  them,  and  the  addition  of  artificial 
selection  can  do  no  more  than  make  the  coefficient  of  selection,  s, 
equal  to  1.  This  would  probably  seldom  do  more  than  double  the 
present  coefficient  of  selection,  and  the  incidence  of  defects  would  be 
reduced  to  not  less  than  half  their  present  values  (equations  2.13, 
2.14,  2.15).   With  a  dominant  gene  the  effect  would  be  immediate, 
but with a recessive the approach to the new equilibrium would be
extremely  slow. 
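
The limit of "not less than half" follows directly from the equilibrium formulae. As a worked step, write s for the present coefficient of selection and s' for the intensified one (the primed symbols are notation introduced here, not used elsewhere in the text). The incidence of a recessive defect is the homozygote frequency, u/s by equation 2.12, and that of a dominant defect is very nearly H, which is 2v/s by equation 2.15, so that

```latex
\[
\frac{q'^2}{q^2} = \frac{u/s'}{u/s} = \frac{s}{s'},
\qquad
\frac{H'}{H} = \frac{2v/s'}{2v/s} = \frac{s}{s'} .
\]
```

With s' = 1 and a present coefficient of at least one half, the ratio s/s' cannot fall below one half.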

The  situation  with  respect  to  recessives  is  complicated  by  the 
fact  that  deleterious  recessives  are  certainly  not  at  their  equilibrium 
frequencies  in  present-day  human  populations  (Haldane,  1939). 
The  reason  is  that  modern  civilisation  has  reduced  the  degree  of 
subdivision (i.e. inbreeding) and so reduced the frequency of homozygotes,
as will be explained in the next chapter. In consequence
both  the  gene  frequencies  and  the  homozygote  frequencies  are  below 
their  equilibrium  values,  and  must  be  presumed  to  be  at  present 
increasing  slowly  toward  new  equilibria  at  higher  values. 

Perhaps  the  converse  of  the  question  posed  above  is  one  that 
should  give  us  more  concern,  namely  the  consequences  of  the  reduced 
intensity  of  natural  selection  under  modern  conditions.  Minor 
genetic  defects,  such  as  colour-blindness,  must  presumably  have  had 
some  selective  disadvantage  in  the  past  but  now  have  very  little,  if 
any,  effect  on  fitness.  Moreover,  the  development  and  extension  of 
medical  treatment  prolongs  the  lives  of  many  people  with  diseases 
that  have  at  least  some  degree  of  genetic  causation  through  genes  that 
increase  susceptibility.  This  relaxation  of  the  selection  operating  on 
minor  genetic  defects  and  against  genes  concerned  in  the  causation  of 
disease  suggests  that  the  frequencies  of  these  genes  will  increase 
toward  new  equilibria  at  higher  values.  If  this  is  true  we  must  expect 
the  incidence  of  minor  genetic  defects  to  increase  in  the  future,  and 
also  the  proportion  of  people  who  need  medical  treatment  for  a 
variety  of  diseases.  By  applying  humanitarian  principles  for  our  own 
good  now  we  are  perhaps  laying  up  a  store  of  inconvenience  for  our 
descendants  in  the  distant  future. 

Selection  favouring  heterozygotes.  We  have  considered  the 
effects  of  selection  operating  on  genes  that  are  partially  or  fully 
dominant  with  respect  to  fitness;  but,  though  the  appropriate  for- 
mula was  given  in  Table  2.1,  we  have  not  yet  discussed  the  conse- 
quences of  overdominance  with  respect  to  fitness;  that  is,  when  the 
heterozygote  has  a  higher  fitness  than  either  homozygote.  At  first 
sight  it  may  seem  rather  improbable  that  selection  should  favour  the 
heterozygote  of  two  alleles  rather  than  one  or  other  of  the  homo- 
zygotes,  but  there  are  reasons  for  thinking  that  this  in  fact  is  not  at  all 
an  uncommon  situation.  Let  us  first  examine  the  consequences  of 
this  form  of  selection,  and  then  consider  the  evidence  of  its  occur- 
rence in  nature. 

Selection operating on a gene with partial or complete dominance
tends toward the total elimination of one or other allele, the final gene
frequency, in the absence of mutation, being 0 or 1. When selection
favours the heterozygote, however, the gene frequency tends toward
an equilibrium at an intermediate value, both alleles remaining in the
population, even without mutation. The reason is as follows. The
change of gene frequency after one generation was given in Table 2.1
as being

$$\Delta q = \frac{pq(s_1 p - s_2 q)}{1 - s_1 p^2 - s_2 q^2}$$

The condition for equilibrium is that Δq = 0, and this is fulfilled when
s1p = s2q. The gene frequencies at this point of equilibrium are
therefore

$$\frac{p}{q} = \frac{s_2}{s_1} \quad \text{or} \quad q = \frac{s_1}{s_1 + s_2} \qquad (2.16)$$

Now, if q is greater than its equilibrium value (but not 1), and p
therefore less, s1p will be less than s2q, and Δq will be negative; that is
to say q will decrease. Similarly if q is less than its equilibrium value
(but not 0) it will increase. Therefore when the gene frequency has
any value, except 0 or 1, selection changes it toward the intermediate
point of equilibrium given in equation 2.16, and both alleles remain
permanently in the population. Three or more alleles at a locus are
maintained in the same way, provided the heterozygote of any pair
is superior in fitness to both homozygotes of that pair (Kimura,
1956). A feature of the equilibrium worthy of note is that the gene
frequency depends not on the degree of superiority of the heterozygote
but on the relative disadvantage of one homozygote compared
with that of the other. Therefore there is a point of equilibrium at
some more or less intermediate gene frequency whenever a heterozygote
is superior to both the homozygotes, no matter by how little.
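
The approach to equilibrium described above can be checked numerically. The following is a minimal sketch, assuming the fitnesses 1 - s1, 1, and 1 - s2 for the three genotypes as in Table 2.1; the function names and the selection coefficients are illustrative choices, not taken from the text. It applies the expression for Δq repeatedly from several starting frequencies and prints the result beside the equilibrium given by equation 2.16.

```python
def delta_q(q, s1, s2):
    """One-generation change of q when the fitnesses are 1 - s1, 1, 1 - s2."""
    p = 1.0 - q
    return p * q * (s1 * p - s2 * q) / (1.0 - s1 * p * p - s2 * q * q)


def iterate(q, s1, s2, generations):
    """Apply q -> q + delta_q repeatedly and return the final frequency."""
    for _ in range(generations):
        q += delta_q(q, s1, s2)
    return q


s1, s2 = 0.1, 0.3            # illustrative coefficients, not from the text
q_hat = s1 / (s1 + s2)       # equilibrium frequency, equation 2.16

for start in (0.01, 0.5, 0.99):
    print(start, round(iterate(start, s1, s2, 500), 4), round(q_hat, 4))
```

Starting frequencies of exactly 0 or 1 are left unchanged by the recurrence, which corresponds to the exceptions noted in the text.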

Our  previous  consideration  of  genes  with  complete  dominance 
showed  that  the  balance  between  selection  and  mutation  satisfactorily 
accounts  for  the  presence  of  deleterious  genes  at  low  frequencies, 
causing  the  appearance  of  rare  abnormal,  or  mutant,  individuals. 
Genes  at  intermediate  frequencies,  however,  are  common  in  very 
many  species,  and  the  presence  of  these  cannot  satisfactorily  be 
accounted for in this way. But the intermediate frequencies are just
what  would  be  expected  if  selection  favoured  the  heterozygotes. 
The  existence  in  a  population  of  individuals  with  readily  discernible 
differences  caused  by  genes  at  intermediate  frequencies  is  referred  to 
as  polymorphism.    The  blood  group  differences  of  man  are  perhaps 
the  best  known  examples,  but  antigenic  differences  are  found  also  in 
many  other  species  and  are  probably  universal  in  animals.    More 
striking  forms  of  polymorphism  are  the  colour  varieties  found  in 
many  species,  particularly  among  insects,  snails,  and  fishes.    The 
genes  causing  polymorphism  have  usually  no  obvious  advantage  of 
one  allele  over  another,  all  the  genotypes  being  essentially  normal,  or 
"wild-type,"  individuals.  In  these  circumstances,  as  we  noted  above, 
only  a  very  slight  superiority  of  the  heterozygote  would  be  sufficient 
to  establish  an  equilibrium  at  an  intermediate  gene  frequency.   The 
properties  of  the  genes  concerned  with  polymorphism  seem,  there- 
fore, to  accord  well  with  the  hypothesis  that  selection  is  operating  on 
them  in  favour  of  the  heterozygotes,  and  this  is  generally  conceded  to 
be  the  most  probable  reason  for  their  intermediate  frequencies.  As  a 
general  cause  of  polymorphism,  however,  it  cannot  be  taken  as  fully 
proved,    because   the   superior   fitness   of  heterozygotes   has   been 
demonstrated  in  relatively  few  cases,  and  there  are  other  possible 
reasons  for  the  existence  of  polymorphism.    For  example,  the  genes 
might  be  in  a  transitional  stage  of  a  change  from  one  extreme  to  the 
other  as  a  result  of  slow  environmental  change;  or  the  intermediate 
frequencies  might  be  the  point  of  equilibrium  between  mutation  in 
opposite  directions,  with  virtually  no  selective  advantage  of  one  allele 
over  the  other.  But  these  explanations  seem  improbable,  particularly 
as  some  polymorphisms  are  known  to  be  of  very  long  standing.   The 
polymorphism  of  shell  colours  in  the  land  snail  Cepaea  nemoralis,  for 
example,  goes  back  to  Neolithic  times  (Cain  and  Sheppard,  1954a). 
Another  possible  cause  of  polymorphism  lies  in  the  heterogeneity  of 
the  environment  in  which  a  population  lives.    If  the  differences  of 
environment  influence  the  selection  coefficients  in  such  a  way  that 
one  allele  is  favoured  in  some  conditions  and  another  allele  in  other 
conditions,  then  polymorphism  may  result  provided  that  mating  is 
not  entirely  at  random  over  the  range  of  environments.  (See  Levene, 
1953; Li, 1955b; Mather, 1955a; Waddington, 1957.)

If  heterozygotes  are  indeed  superior  in  fitness,  one  naturally 
wants  to  enquire  into  the  nature  of  their  superiority.  Unfortunately, 
however, very little is known about this, though evidence is accumulating,
in the case of the human blood groups, that certain blood groups
are  associated  with  an  increased  susceptibility  to  certain  diseases 
(Roberts,  1957);  group  O,  for  example,  with  duodenal  ulcer  and  group 
A  with  pernicious  anaemia.  If  one  states  this  the  other  way  round 
and  says  that  the  other  alleles  confer  increased  resistance  to  these 
diseases,  then  it  is  not  unreasonable  to  suppose  that  each  allele 
increases  resistance  to  different  diseases,  and  that  the  presence  of  two 
alleles  increases  the  resistance  to  two  different  diseases,  thereby 
giving  a  selective  advantage  to  the  heterozygote. 

Another  question  of  interest  concerns  the  evolutionary  signifi- 
cance of  polymorphism.  Is  it  an  "adaptive"  feature  of  a  species? 
Does  it,  in  other  words,  confer  some  advantage  over  a  population 
without it? Some think that it does (see, particularly, Dobzhansky,
1951b). Others, however, point out that the average fitness of a
population  with  polymorphism  resulting  from  superior  fitness  of 
heterozygotes  is  less  than  that  of  a  population  in  which  a  single 
allele  performs  the  same  function  as  the  two  different  alleles  in  the 
heterozygote (Cain and Sheppard, 1954b). On this view, polymorphism
is a situation that, once established, is perpetuated by selection
between  individuals  within  the  population,  but  is  a  disadvantage  to 
the  population  as  a  whole  in  competition  with  another  population 
lacking  the  polymorphism. 

The  foregoing  account  of  polymorphism  leaves  many  problems 
unsolved,  and  does  little  more  than  sketch  the  outlines  of  a  most 
interesting  aspect  of  the  genetics  of  populations.  In  particular,  we 
have  not  mentioned  the  extensive  and  detailed  investigations  of  poly- 
morphism in  respect  of  inverted  segments  of  chromosomes  found  in 
species  of  Drosophila  and,  to  a  lesser  extent,  in  some  other  animals 
and  plants.  For  a  description  of  these  studies,  and  also  for  a  fuller 
general  account  of  polymorphism,  the  reader  must  be  referred  to 
Dobzhansky  (1951a).  We  conclude  by  giving  one  example  of  poly- 
morphism where  the  nature  of  the  superiority  of  heterozygotes  is 
clear.  Other  cases  are  described  by  Dobzhansky  (1951a),  Ford 
(1953),  Lerner  (1954),  and  Sheppard  (1958). 

Example  2.4.  Sickle-cell  anaemia  (Allison,  1955).  There  is  a  gene, 
found  in  American  negroes  and  in  the  indigenous  East  Africans,  which 
causes  the  formation  of  an  abnormal  type  of  haemoglobin.  Homozygotes 
suffer  from  an  anaemia,  characterised  by  the  "sickle"  shape  of  the  erythro- 
cytes; it  is  a  severe  disease  from  which  many  die.  All  the  haemoglobin  of 
homozygotes  is  of  the  abnormal  type,  though  there  is  a  variable  admixture 
of foetal haemoglobin. Heterozygotes do not suffer from anaemia, but
they can be recognised by the presence of sickle cells if the haemoglobin
is deoxygenated. About 35 per cent of their haemoglobin is of the
abnormal type. With respect to haemoglobin synthesis, therefore, the
sickle-cell gene is partially dominant, though with respect to the anaemia
it is recessive, and with respect to fitness it has been proved to be
overdominant. In routine surveys the few surviving homozygotes are not
readily  distinguished  from  heterozygotes;  we  shall  refer  to  the  combined 
heterozygotes  and  surviving  homozygotes  as  "abnormals."  The  frequency 
of  abnormals  varies  very  much  with  the  locality:  in  American  negroes  it 
is  about  9  per  cent,  and  in  different  parts  of  Africa  it  varies  from  zero  up 
to  a  maximum  of  about  40  per  cent.  In  view  of  the  severe  disability  of  the 
homozygotes  it  is  impossible  to  account  for  these  high  frequencies  unless 
the  heterozygotes  have  a  quite  substantial  selective  advantage  over  the 
normal  homozygotes.  The  nature  of  this  selective  advantage  has  been 
shown  to  be  connected  with  resistance  to  malaria.  Heterozygotes  are  less 
susceptible  to  malaria  than  normal  homozygotes,  and  the  frequency  of 
abnormals  in  different  areas  is  correlated  with  the  prevalence  of  malaria. 
Let  us  work  out  the  gene  frequency  corresponding  with  the  maximum 
frequency  of  40  per  cent  abnormals,  and  then  find  the  magnitude  of  the 
selective  advantage  of  heterozygotes  necessary  to  maintain  this  gene 
frequency  in  equilibrium. 

If the gene frequency is in equilibrium it will be the same after
selection has taken place as it was before. Therefore, if we assume that
all the selection takes place before adulthood, an assumption that is not
very far from the truth, we can estimate the gene frequency from the
genotype frequencies in the adult population. But it is first necessary to
know what proportion of abnormals are homozygotes. This has been
estimated as being approximately 2.9 per cent (Allison, 1954). Thus, when
the frequency of abnormals is 0.4, the frequency of homozygotes is 0.012,
and that of heterozygotes is 0.388. The gene frequency, then, by equation
1.1, is the frequency of homozygotes plus half the frequency of
heterozygotes, which comes to q = 0.206. If this gene frequency is the
equilibrium value maintained by natural selection favouring the
heterozygotes, and if we assume mating to be random, then the gene
frequency is related to the selection coefficients by equation 2.16. The
fitness of sickle-cell homozygotes, relative to that of heterozygotes, has
been estimated from a comparison of viability and fertility as being
approximately 0.25. Therefore the coefficient of selection against
homozygotes is s2 = 0.75. Substituting this value of s2, and the value of q
found above, in equation 2.16 gives s1 = 0.197. This is the coefficient of
selection against normal homozygotes, relative to heterozygotes. If we
want to express the selective advantage of heterozygotes as the
superiority of heterozygotes, relative to normal homozygotes, we may do
so, since the fitness of heterozygotes relative to normal homozygotes is
1/(1 - s1). This is 1.24. Thus the selective advantage to be attributed to
the resistance of heterozygotes to malaria, if these are the forces holding
the gene in equilibrium, is 24 per cent.
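
The arithmetic of this example can be retraced in a few lines. The sketch below is only an illustration: the variable names are assumptions, and the inputs are the rounded figures quoted above (40 per cent abnormals, 2.9 per cent of them homozygous, and a relative fitness of 0.25 for sickle-cell homozygotes). Equation 2.16 is rearranged to give s1 in terms of q and s2.

```python
abnormals = 0.40     # maximum frequency of "abnormals" quoted in the text
homo_share = 0.029   # proportion of abnormals that are homozygotes
s2 = 1.0 - 0.25      # selection against sickle-cell homozygotes

homozygotes = abnormals * homo_share
heterozygotes = abnormals - homozygotes

# Equation 1.1: homozygotes plus half the heterozygotes.
q = homozygotes + 0.5 * heterozygotes

# Equation 2.16, q = s1 / (s1 + s2), rearranged for s1.
s1 = s2 * q / (1.0 - q)

# Fitness of heterozygotes relative to normal homozygotes.
superiority = 1.0 / (1.0 - s1)

print(round(q, 3), round(s1, 3), round(superiority, 2))
```

With these rounded inputs the gene frequency and the 24 per cent superiority agree with the figures in the text; the intermediate value of s1 may differ from the printed one in the third decimal place, depending on how the intermediate quantities are rounded.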

The  presence  of  the  sickle-cell  gene  in  American  negroes  can  be 
attributed to their African origin. The gene's present frequency of 0.046,
deduced  in  the  manner  described  above,  can  be  accounted  for  partly  by 
racial  mixture  and  partly  by  the  change  of  habitat  which,  removing  the 
advantage  of  heterozygotes,  has  exposed  the  gene  to  the  full  power  of  the 
selection  against  homozygotes. 

As  an  example  of  polymorphism  the  sickle-cell  gene  is  not  altogether 
typical,  because  the  differences  of  fitness  are  rather  large  and  one  of  the 
genotypes  is  clearly  abnormal.  But  it  illustrates  in  an  exaggerated  form 
the  nature  of  the  selective  forces  that  are  presumed  to  underlie  the  more 
usual  forms  of  polymorphism. 


CHAPTER   3 

SMALL  POPULATIONS: 

I.  Changes  of  Gene  Frequency  under 
Simplified  Conditions 

We  have  now  to  consider  the  last  of  the  agencies  through  which  gene 
frequencies  can  be  changed.  This  is  the  dispersive  process,  which 
differs  from  the  systematic  processes  in  being  random  in  direction, 
and  predictable  only  in  amount.  In  order  to  exclude  this  process 
from  the  previous  discussions  we  have  postulated  always  a  "large" 
population,  and  we  have  seen  that  in  a  large  population  the  gene 
frequencies  are  inherently  stable.  That  is  to  say,  in  the  absence  of 
migration,  mutation,  or  selection,  the  gene  and  genotype  frequencies 
remain  unaltered  from  generation  to  generation.  This  property  of 
stability  does  not  hold  in  a  small  population,  and  the  gene  frequencies 
are  subject  to  random  fluctuations  arising  from  the  sampling  of 
gametes.  The  gametes  that  transmit  genes  to  the  next  generation 
carry  a  sample  of  the  genes  in  the  parent  generation,  and  if  the  sample 
is  not  large  the  gene  frequencies  are  liable  to  change  between  one 
generation  and  the  next.  This  random  change  of  gene  frequency 
is  the  dispersive  process. 

The  dispersive  process  has,  broadly  speaking,  three  important 
consequences.  The  first  is  differentiation  between  sub-populations. 
The  inhabitants  of  a  large  area  seldom  in  nature  constitute  a  single 
large  population,  because  mating  takes  place  more  often  between 
inhabitants  of  the  same  region.  Natural  populations  are  therefore 
more  or  less  subdivided  into  local  groups  or  sub-populations,  and  the 
sampling  process  tends  to  cause  genetic  differences  between  these,  if 
the  number  of  individuals  in  the  groups  is  small.  Domesticated  or 
laboratory  populations,  in  the  same  way,  are  often  subdivided — for 
example,  into  herds  or  strains — and  in  them  the  subdivision  and  its 
resultant  differentiation  are  often  more  marked.  The  second  con- 
sequence is  a  reduction  of  genetic  variation  within  a  small  population. 
The  individuals  of  the  population  become  more  and  more  alike  in 
genotype,  and  this  genetic  uniformity  is  the  reason  for  the  widespread 
use of inbred strains of laboratory animals in physiological and allied
fields  of  research.  (An  inbred  strain,  it  may  be  noted,  is  a  small 
population.)  The  third  consequence  of  the  dispersive  process  is  an 
increase  in  the  frequency  of  homozygotes  at  the  expense  of  hetero- 
zygotes.  This,  coupled  with  the  general  tendency  for  deleterious 
alleles  to  be  recessive,  is  the  genetic  basis  of  the  loss  of  fertility  and 
viability  that  almost  always  results  from  inbreeding.  To  explain 
these  three  consequences  of  the  dispersive  process  is  the  chief  purpose 
of  this  chapter. 

There  are  two  different  ways  of  looking  at  the  dispersive  process 
and  of  deducing  its  consequences.  One  is  to  regard  it  as  a  sampling 
process  and  to  describe  it  in  terms  of  sampling  variance.  The  other 
is  to  regard  it  as  an  inbreeding  process  and  describe  it  in  terms  of  the 
genotypic  changes  resulting  from  matings  between  related  indi- 
viduals. Of  these,  the  first  is  probably  the  simpler  for  a  description 
of  how  the  process  works,  but  the  second  provides  a  more  convenient 
means  of  stating  the  consequences.  The  plan  to  be  followed  here  is 
first  to  describe  the  general  nature  of  the  dispersive  process  from  the 
point  of  view  of  sampling.  This  will  show  how  the  three  chief  con- 
sequences come  about.  Then  we  shall  approach  the  process  afresh 
from  the  point  of  view  of  inbreeding,  and  show  how  the  two  view- 
points connect  with  each  other.  In  all  this  we  shall  confine  our 
attention  to  the  simplest  possible  situation,  excluding  migration, 
mutation,  and  selection.  Thus  we  shall  see  what  happens  in  small 
populations  in  the  absence  of  other  factors  influencing  gene  frequency. 
In  the  next  chapter  we  shall  extend  the  conclusions  to  more  realistic 
situations,  by  removing  the  restrictive  simplifications,  and  we  shall  in 
particular  consider  the  joint  effects  of  the  dispersive  process  and  the 
systematic  processes.  Finally,  in  Chapter  5,  we  shall  consider  the 
special  cases  of  pedigreed  populations,  and  very  small  populations 
maintained  by  regular  systems  of  close  inbreeding. 

The  Idealised  Population 

In  order  to  reduce  the  dispersive  process  to  its  simplest  form  we 
imagine  an  idealised  population  as  follows.  We  suppose  there  to  be 
initially  one  large  population  in  which  mating  is  random,  and  this 
population becomes subdivided into a large number of sub-populations.
The subdivision might arise from geographical or ecological causes
under natural conditions, or from controlled breeding in domesticated
or laboratory populations. The initial random-mating population will
be referred to as the base population, and the sub-populations will be
referred to as lines. All the lines together constitute the whole
population, and each line is a "small population" in
which  gene  frequencies  are  subject  to  the  dispersive  process.  When  a 
single  locus  is  under  discussion  we  cannot  properly  understand  what 
goes  on  in  one  line  except  by  considering  it  as  one  of  a  large  number 
of  lines.  But  what  happens  to  the  genes  at  one  locus  in  a  number  of 
lines  happens  equally  to  those  at  a  number  of  loci  in  one  line,  pro- 
vided they  all  start  at  the  same  gene  frequency.  So  the  consequences 
of  the  process  apply  equally  to  a  single  line  provided  we  consider 
many  loci  in  it. 

The  simplifying  conditions  specified  for  the  idealised  population 
are  the  following: 

1. Mating is restricted to members of the same line. The lines
are  thus  isolated  in  the  sense  that  no  genes  can  pass  from  one  line  to 
another.  In  other  words  migration  is  excluded. 

2.  The  generations  are  distinct  and  do  not  overlap. 

3.  The  number  of  breeding  individuals  in  each  line  is  the  same  for 
all lines and in all generations. Breeding individuals are those that
transmit  genes  to  the  next  generation. 

4.  Within  each  line  mating  is  random,  including  self-fertilisation 
in  random  amount. 

5.  There  is  no  selection  at  any  stage. 

6.  Mutation  is  disregarded. 

Fig. 3.1. Diagrammatic representation of the subdivision of a
single large population, the base population, into a number of
sub-populations, or lines.

The situation implied by these conditions is represented
diagrammatically in Fig. 3.1, and may be described thus: All breeding
individuals contribute equally to a pool of gametes from which zygotes
will  be  formed.  Union  of  gametes  is  strictly  random.  Out  of  a 
potentially  large  number  of  zygotes  only  a  limited  number  survive  to 
become  breeding  individuals  in  the  next  generation,  and  this  is  the 
stage  at  which  the  sampling  of  the  genes  transmitted  by  the  gametes 
takes  place.  Survival  of  zygotes  is  random,  and  consequently  the 
contribution  of  the  parents  to  the  next  generation  is  not  uniform,  but 
varies  according  to  the  chances  of  survival  of  their  progeny.  Since 
the  population  size  is  constant  from  generation  to  generation,  the 
average  number  of  progeny  that  reach  breeding  age  is  one  per 
individual  parent  or  two  per  mated  pair  of  parents.  For  any  particular 
zygote  the  chance  of  survival  is  small,  and  therefore  the  number  of 
progeny  contributed  by  individual  parents,  or  by  pairs  of  parents,  has 
a  Poisson  distribution. 

The  following  symbols  will  be  used  in  connexion  with  the 
idealised  population. 

N = the number of breeding individuals in each line and generation.
    This is the population size.
t = time, in generations, starting from the base population at t0.
q = frequency of a particular allele at a locus.
p = 1 - q = frequency of all other alleles at that locus.

q and p refer to the frequencies in any one line; $\bar{q}$ and $\bar{p}$
refer to the frequencies in the whole population and are the means of
q and p; q0 and p0 are the frequencies in the base population.

Sampling 

Variance  of  gene  frequency.  The  change  of  gene  frequency 
resulting  from  sampling  is  random  in  the  sense  that  its  direction  is 
unpredictable.  But  its  magnitude  can  be  predicted  in  terms  of  the 
variance  of  the  change.  Consider  the  formation  of  the  lines  from 
the base population. Each line is formed from a sample of N
individuals drawn from the base population. Since each individual
carries  two  genes  at  a  locus,  the  sub-division  of  the  population 
represents  a  series  of  samples  each  of  2N  genes,  drawn  at  random 
from  the  base  population.  The  gene  frequencies  in  these  samples 
will  have  an  average  value  equal  to  that  in  the  base  population,  i.e. 
q0,  and  will  be  distributed  about  this  mean  with  a  variance  p0q0/2N, 
which  is  simply  the  variance  of  a  ratio,  the  sample  size  being  in  this 
case 2N. Thus the change of gene frequency, Δq, resulting from
sampling in one generation, can be stated in terms of its variance as

$$\sigma^2_{\Delta q} = \frac{p_0 q_0}{2N} \qquad (3.1)$$

This variance of Δq expresses the magnitude of the change of gene
frequency  resulting  from  the  dispersive  process.  It  expresses  the 
expected  change  in  any  one  line,  or  the  variance  of  gene  frequencies 
that  would  be  found  among  many  lines  after  one  generation.  Its 
effect  is  a  dispersion  of  gene  frequencies  among  the  lines;  in  other 
words  the  lines  come  to  differ  in  gene  frequency,  though  the  mean 
in  the  population  as  a  whole  remains  unchanged. 
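
A single generation of sampling is easy to imitate. The sketch below is a hedged illustration rather than a description of any experiment in the text: the function name, the seed, and the values of q0, N, and the number of lines are all assumptions. Each line is formed by drawing 2N genes at random from a base population with gene frequency q0, and the variance among the resulting line frequencies is compared with p0q0/2N.

```python
import random
from statistics import fmean, pvariance


def sample_line(q, n_individuals, rng):
    """Gene frequency in one line after drawing 2N genes at random."""
    n_genes = 2 * n_individuals
    copies = sum(rng.random() < q for _ in range(n_genes))
    return copies / n_genes


rng = random.Random(1)            # fixed seed so the run is repeatable
q0, N, n_lines = 0.5, 8, 20000    # illustrative values, not from the text

freqs = [sample_line(q0, N, rng) for _ in range(n_lines)]
expected = q0 * (1 - q0) / (2 * N)   # variance of the change, p0*q0/2N

print(fmean(freqs), pvariance(freqs), expected)
```

The mean of the line frequencies should stay close to q0 while the variance among lines should be close to the expected value, which is the sense in which the process is random in direction but predictable in amount.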

In the next generation the sampling process is repeated, but each
line now starts from a different gene frequency and so the second
sampling leads to a further dispersion. The variance of the change
now differs among the lines, since it depends on the gene frequency,
q1, in the first generation of each line separately. The effect of
continued sampling through successive generations is that each line
fluctuates irregularly in gene frequency, and the lines spread apart
progressively, thus becoming differentiated. The erratic changes of gene
frequency shown by the individual lines are exemplified in Fig. 3.2,
and the consequent differentiation, or spreading apart, of the lines
in Fig. 3.3. These changes of gene frequency resulting from sampling
in small populations are known as random drift (Wright, 1931).

Fig. 3.2. Random drift of the colour gene "non-agouti" in three
lines of mice, each maintained by 6 pairs of parents per generation.
Horizontal axis: generations. (Original data.)

Fig. 3.3. Distributions of gene frequencies in 19 consecutive
generations among 105 lines of Drosophila melanogaster, each of 16
individuals. The gene frequencies refer to two alleles at the
"brown" locus (bw75 and bw), with initial frequencies of 0.5. The
height of each black column shows the number of lines having the
gene frequency shown on the scale below. (From Buri, 1956;
reproduced by courtesy of the author and the editor of Evolution.)

Fig. 3.4. Variance of gene frequencies among lines in the
experiment illustrated in Fig. 3.3. The circles are the observed values,
and the smooth curve shows the expected variance as given by
equation 3.2. The value taken for N is 11.5, which is the "effective
number," Ne, as explained in the next chapter. (Data from Buri,
1956.)

As the dispersive process proceeds, the variance of gene frequency
among the lines increases, as shown in Fig. 3.4. At any generation, t,
the variance of gene frequencies, $\sigma^2_q$, among the lines is as follows
(see Crow, 1954):

$$\sigma^2_q = p_0 q_0 \left[ 1 - \left( 1 - \frac{1}{2N} \right)^t \right] \qquad (3.2)$$

Since the mean gene frequency among all the lines remains unchanged,
$\bar{q} = q_0$. We may note a fact that will be needed later, and is obvious
from equation 3.2, namely that $\sigma^2_q = \sigma^2_p$. The dispersion of the gene
frequencies, which we have described by reference to one locus in
many lines, could equally well be described by reference to the
frequencies at a number of different loci in one line, provided they all
started from the same initial frequency, and were unlinked.
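
The expected variance among lines can be tabulated directly. This is a small sketch under stated assumptions: the function name is hypothetical, and the value N = 11.5 is the effective number quoted in the caption of Fig. 3.4, used here with initial frequencies of 0.5 purely as an illustration.

```python
def drift_variance(p0, q0, n, t):
    """Expected variance of gene frequency among lines after t generations."""
    return p0 * q0 * (1.0 - (1.0 - 1.0 / (2.0 * n)) ** t)


p0 = q0 = 0.5
n = 11.5   # effective number quoted for Fig. 3.4

for t in (0, 1, 5, 10, 19, 1000):
    print(t, round(drift_variance(p0, q0, n, t), 4))
```

Two checks follow from the formula itself: at t = 1 it reduces to p0q0/2N, the variance of a single generation of sampling, and as t becomes very large it approaches p0q0.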

Fixation. There are limits to the spreading apart of the lines that
can be brought about by the dispersive process. The gene frequency
cannot change beyond the limits of 0 or 1, and sooner or later each
line must reach one or other of these limits. Moreover, the limits are
"traps" or points of no return, because once the gene frequency has
reached 0 or 1 it cannot change any more in that line. When a
particular allele has reached a frequency of 1 it is said to be fixed in
that line, and when it reaches a frequency of 0 it is lost. When an
allele reaches fixation no other allele can be present in that line, and
the line may then be said to be fixed. When a line is fixed all
individuals in it are of identical genotype with respect to that locus.
Eventually all lines, and all loci in a line, become fixed. The
individuals of a line are then genetically identical, and this is the basis
of the genetic uniformity of highly inbred strains.

The  proportion  of  the  lines  in  which  different  alleles  at  a  locus  are 
fixed  is  equal  to  the  initial  frequencies  of  the  alleles.  If  the  base 
population contains two alleles A1 and A2 at frequencies p0 and q0
respectively, then A1 will be fixed in the proportion p0 of the lines,
and  A2  in  the  remaining  proportion,  q0.  The  variance  of  the  gene 
frequency  among  the  lines  is  then  p0q0,  as  may  be  seen  from  equation 
3.2  by  putting  t  equal  to  infinity.  (In  Fig.  3.3  the  lines  in  which 
fixation  or  loss  has  just  occurred  are  shown,  but  not  those  in  which  it 
occurred  earlier.) 
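
The statement that an allele is fixed in a proportion of lines equal to its initial frequency can be illustrated by simulation. The sketch below is an assumption-laden illustration: the function name, seed, and parameter values are invented for the purpose, and every line is started at exactly the base frequency rather than being sampled from a base population. Each line is resampled, 2N genes at a time, until the allele is either lost or fixed.

```python
import random


def drift_until_fixed(q, n_individuals, rng):
    """Resample 2N genes each generation until the allele is lost or fixed."""
    n_genes = 2 * n_individuals
    copies = round(q * n_genes)
    while 0 < copies < n_genes:
        freq = copies / n_genes
        copies = sum(rng.random() < freq for _ in range(n_genes))
    return copies // n_genes   # 1 if the allele was fixed, 0 if it was lost


rng = random.Random(2)            # fixed seed so the run is repeatable
q0, N, n_lines = 0.25, 4, 4000    # illustrative values, not from the text

fixed = sum(drift_until_fixed(q0, N, rng) for _ in range(n_lines))
print(fixed / n_lines, "compared with initial frequency", q0)
```

A very small population size is chosen only to keep the run short; the proportion of lines in which the allele is fixed should be close to q0 whatever value of N is used.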

When concerned with the attainment of genetic uniformity one
wants to know how soon fixation takes place; what is the probability
of a particular locus being fixed, or what proportion of all loci in a
line will be fixed, after a certain number of generations. Consideration
of the progressive nature of the dispersion, as illustrated in Fig.
3.3, will show that fixation does not start immediately; the dispersion of
gene frequencies must proceed some way before any line is likely to
reach fixation. To deduce the probability of fixation is mathematically
complicated (see particularly Wright, 1931; Kimura, 1955), and
only an outline of the conclusions can be given here. There are two
phases in the dispersive process: during the initial phase the gene
frequencies are spreading out from the initial value; this leads to a
steady phase, when the gene frequencies are evenly spread out over
the range between the two limits, and all gene frequencies except the
two limits are equally probable. The duration of the initial phase
in generations is a small multiple of the population size, depending
on the initial gene frequency. With q0 = 0.5 it lasts about 2N
generations, and with q0 = 0.1 it lasts about 4N generations (Kimura, 1955).
(In the experiment illustrated in Fig. 3.3 it lasted till about the
seventeenth generation.) The theoretical distributions of gene
frequency during the initial phase, with original frequencies of 0.5
and 0.1, are shown in Fig. 3.5.

Fig. 3.5. Theoretical distributions of gene frequency among lines.
The initial and mean gene frequency is 0.5 in the left hand figure,
and 0.1 in the right hand figure. Previously fixed lines are excluded.
N = population size; T = time in generations. Note the general
agreement of the left hand figure with the observed distributions
shown in Fig. 3.3. (From Kimura, 1955; reproduced by courtesy
of the author and the editor of the Proc. Nat. Acad. Sci. Wash.)

To  visualise  the  process  one  might  think  of  a  pile  of  dry  sand  in  a 
narrow  trough  open  at  the  two  ends.  Agitation  of  the  trough  will 
cause  the  pile  to  spread  out  along  the  trough,  till  eventually  it  is 
evenly  spread  along  its  length.  Toward  the  end  of  the  spreading  out 
some  of  the  sand  will  have  fallen  off  the  ends  of  the  trough,  and  this 
represents  fixation  and  loss.    Continued  agitation  after  the  sand  is 
evenly spread will cause it to fall off the ends at a steady rate, and the
depth  of  sand  left  in  the  trough  will  be  continually  reduced  at  a 
steady  rate  until  in  the  end  none  is  left.  The  initial  gene  frequency  is 
represented  by  the  position  of  the  initial  pile  of  sand.  If  it  is  near  one 
end of the trough, much of the sand will have fallen off that end before
any reaches the other end, and the total amount falling off each
end will be in proportion to the relative distance of the initial pile
from the two ends. Relating this model to the diagram of the process
in Fig. 3.5, the position along the trough represents the horizontal
axis, or gene frequency, and the depth of the sand represents the
vertical axis, or the probability of a line having a particular gene
frequency. The graphs are thus analogous to longitudinal sections
through the trough and its sand.

Fig. 3.6. Fixation and loss occurring among 107 lines of Drosophila
melanogaster, during 19 generations (horizontal axis: generations).
This is not the same experiment as that illustrated in Figs. 3.3 and 3.4,
but was similar in nature. There were 16 parents per generation in each
line, and the effective number (see chapter 4) was 9. The closed circles
show the percentage of lines in which the bw75 allele has become
fixed; the open circles show the percentage in which it has been
lost and the bw allele fixed. The smooth curve is the expected
amount of fixation of one or other allele, computed from the effective
number by equation 3.3. (Data from Buri, 1956.)

The  probability  of  fixation  at  any  time  during  the  initial  phase  is 
too  complicated  for  explanation  here,  and  the  reader  is  referred  to 
the  papers  of  Kimura  (1954,  1955).  After  the  steady  phase  has  been 
reached fixation proceeds at a constant rate: a proportion 1/2N of the
lines  previously  unfixed  become  fixed  in  each  generation.  The 
proportion  of  lines  in  which  a  gene  with  initial  frequency  q0  is 
expected  to  be  fixed,  lost,  or  to  be  still  segregating  is  as  follows 
(Wright,  1952a): 

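$$
\begin{aligned}
\text{fixed:}   &\quad q_0 - 3p_0q_0P \\
\text{lost:}    &\quad p_0 - 3p_0q_0P \\
\text{neither:} &\quad 6p_0q_0P
\end{aligned}
\tag{3.3}
$$

where

$$P = \left(1 - \frac{1}{2N}\right)^t$$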


Fig.  3.6  shows  the  progress  of  fixation  and  loss  in  an  experiment 
with  Drosophila. 
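
As a rough numerical illustration of equations 3.3, the following Python sketch evaluates the three proportions for given values of q0, N and t. The function name and its arguments are chosen here for illustration only and are not part of the original treatment. The expressions apply only after the steady phase has been reached, so values of t much below about 2N will give meaningless (even negative) results.

```python
def fixation_proportions(q0, n, t):
    """Steady-phase proportions of lines fixed, lost, or still segregating.

    q0 : initial frequency of the allele in question
    n  : population size N (or effective number) of each line
    t  : number of generations; should be beyond the initial phase
    """
    p0 = 1.0 - q0
    big_p = (1.0 - 1.0 / (2 * n)) ** t      # P = (1 - 1/2N)^t
    fixed = q0 - 3 * p0 * q0 * big_p
    lost = p0 - 3 * p0 * q0 * big_p
    neither = 6 * p0 * q0 * big_p
    return fixed, lost, neither


if __name__ == "__main__":
    # Effective number and duration taken from the caption of Fig. 3.6;
    # the initial frequency of 0.5 is an assumption made for this example.
    fixed, lost, neither = fixation_proportions(q0=0.5, n=9, t=19)
    print(f"fixed {fixed:.3f}, lost {lost:.3f}, segregating {neither:.3f}")
```

The three proportions always sum to unity, which gives a simple check on the arithmetic.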

Genotype  frequencies.  Change  of  gene  frequency  leads  to 
change  of  genotype  frequencies;  so  the  genotype  frequencies  in  small 
populations  follow  the  changes  of  gene  frequency  resulting  from  the 
dispersive  process.  In  the  idealised  population,  which  we  are  still 
considering,  mating  is  random  within  each  of  the  lines.  Consequently 
the genotype frequencies in any one line are the Hardy-Weinberg
frequencies appropriate to the gene frequency in the previous generation
of that line. As the lines drift apart in gene frequency they
become differentiated also in genotype frequencies. But differentiation
is not the only aspect of the change: the general direction of the
change is toward an increase of homozygous, and a decrease of
heterozygous, genotypes. The reason for this is the dispersion of gene
frequencies from intermediate values toward the extremes. Heterozygotes
are most frequent at intermediate gene frequencies (see
Fig. 1.1), so the drift of gene frequencies toward the extremes leads,
on the average, to a decline in the frequency of heterozygotes.

The  genotype  frequencies  in  the  population  as  a  whole  can  be 
deduced  from  a  knowledge  of  the  variance  of  gene  frequencies  in  the 
following  way.  If  an  allele  has  a  frequency  q  in  one  particular  line, 
homozygotes of that allele will have a frequency of $q^2$ in that line.
The frequency of these homozygotes in the population as a whole will
therefore be the mean value of $q^2$ over all lines. We shall write this
mean frequency of homozygotes as $\overline{q^2}$. The value of $\overline{q^2}$ can be
found from a knowledge of the variance of gene frequencies among
the lines, by noting that the variance of a set of observations is found
by deducting the square of the mean from the mean of the squared
observations. Thus

$$\sigma_q^2 = \overline{q^2} - \bar{q}^2$$

and

$$\overline{q^2} = \bar{q}^2 + \sigma_q^2 \tag{3.4}$$

where $\sigma_q^2$ is the variance of gene frequencies among the lines, as given
in equation 3.2, and $\bar{q}^2$ is the square of the mean gene frequency.
Since the mean gene frequency, $\bar{q}$, is equal to the original, q0, it
follows that $\bar{q}^2$ or $q_0^2$ is the original frequency of homozygotes in the
base population. Thus in the population as a whole the frequency of
homozygotes of a particular allele increases, and is always in excess
of the original frequency by an amount equal to the variance of the
gene frequency among the lines. In a two-allele system the same
applies to the other allele, and the frequency of heterozygotes is
reduced correspondingly. Noting from equation 3.2 that $\sigma_p^2 = \sigma_q^2$ we
therefore find the genotypic frequencies for a locus with two alleles
as follows:


Genotype     Frequency in whole population

A1A1         $p_0^2 + \sigma_q^2$
A1A2         $2p_0q_0 - 2\sigma_q^2$              (3.5)
A2A2         $q_0^2 + \sigma_q^2$

These genotype frequencies are no longer the Hardy-Weinberg
frequencies appropriate to the original or mean gene frequency. The
Hardy-Weinberg relationships between gene frequency and genotype
frequencies,  though  they  hold  good  within  each  line  separately,  do 
not  hold  if  the  lines  are  taken  together  and  regarded  as  a  single 
population.  This  fact  causes  some  difficulty  in  relating  gene  and 
genotype  frequencies  in  natural  populations,  because  they  are  often 
more  or  less  subdivided  and  the  degree  of  subdivision  is  seldom 
known.  An  example  of  the  decrease  of  heterozygotes  resulting  from 
the  dispersion  of  gene  frequencies  is  shown  in  Fig.  3.7. 
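
The relationship between the variance of gene frequency and the genotype frequencies in the whole population can be checked numerically. The sketch below is an illustration only: the function name is hypothetical, the gene frequencies of the lines are invented, and the mean over the lines supplied is used in place of q0. Each line is taken to be in Hardy-Weinberg proportions, and the pooled frequencies are then compared with equations 3.5.

```python
def pooled_genotype_frequencies(line_freqs):
    """Pool lines that are each in Hardy-Weinberg proportions.

    line_freqs : gene frequency q of allele A2 in each line
    Returns the mean gene frequency, the variance of q among the lines,
    and the pooled frequencies of A1A1, A1A2 and A2A2.
    """
    k = len(line_freqs)
    q_bar = sum(line_freqs) / k
    # Variance among the lines: mean square minus square of the mean.
    var_q = sum(q * q for q in line_freqs) / k - q_bar ** 2
    a1a1 = sum((1 - q) ** 2 for q in line_freqs) / k
    a1a2 = sum(2 * q * (1 - q) for q in line_freqs) / k
    a2a2 = sum(q * q for q in line_freqs) / k
    return q_bar, var_q, (a1a1, a1a2, a2a2)


if __name__ == "__main__":
    # Five invented lines that have drifted apart from a frequency of 0.5.
    q_bar, var_q, (a1a1, a1a2, a2a2) = pooled_genotype_frequencies(
        [0.1, 0.3, 0.5, 0.7, 0.9]
    )
    p_bar = 1 - q_bar
    print("A2A2:", a2a2, "expected", q_bar ** 2 + var_q)
    print("A1A2:", a1a2, "expected", 2 * p_bar * q_bar - 2 * var_q)
    print("A1A1:", a1a1, "expected", p_bar ** 2 + var_q)
```

With these invented lines the heterozygotes fall short of the Hardy-Weinberg frequency by twice the variance, although every line taken separately is in Hardy-Weinberg proportions.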

The  foregoing  account  of  genotype  frequencies  describes  the 
situation  in  terms  of  one  locus  in  many  lines.  It  can  be  regarded 
equally  as  referring  to  many  loci  in  one  line;  then  the  change  in  any 
one  line  or  small  population  is  an  increase  in  the  number  of  loci  at 
which individuals are homozygous and a corresponding decrease in
the number at which they are heterozygous; in short, an increase of
homozygotes at the expense of heterozygotes. This change of genotype
frequencies resulting from the dispersive process is the genetic
basis of the phenomenon of inbreeding depression, of which a full
explanation will be found in Chapter 14.

Fig. 3.7. Change of frequency of heterozygotes among 105 lines
of Drosophila melanogaster, each with 16 parents. The same experiment
as is illustrated in Figs. 3.3 and 3.4. The frequency of
heterozygotes refers to the population as a whole, all lines taken
together. The smooth curve is the expected frequency of heterozygotes.
(Data from Buri, 1956.)

We  have  now  surveyed  the  general  nature  of  the  dispersive  process 
and  its  three  major  consequences — differentiation  of  sub-populations, 
genetic  uniformity  within  sub-populations,  and  overall  increase  in 
the  frequency  of  homozygous  genotypes.  Let  us  now  look  at  the 
process  from  another  viewpoint,  as  an  inbreeding  process.  Instead 
of  regarding  the  increase  of  homozygotes  as  a  consequence  of  the 
dispersion  of  gene  frequencies,  we  shall  now  look  directly  at  the 
manner  in  which  the  additional  homozygotes  arise. 

Inbreeding

Inbreeding  means  the  mating  together  of  individuals  that  are 
related  to  each  other  by  ancestry.  That  the  degree  of  relationship 
between  the  individuals  in  a  population  depends  on  the  size  of  the 
population  will  be  clear  by  consideration  of  the  numbers  of  possible 
ancestors.  In  a  population  of  bisexual  organisms  every  individual 
has  two  parents,  four  grand-parents,  eight  great-grandparents,  etc., 
and t generations back it has $2^t$ ancestors. Not very many generations
back the number of individuals required to provide separate ancestors
for all the present individuals becomes larger than any real population
could contain. Any pair of individuals must therefore be related
to  each  other  through  one  or  more  common  ancestors  in  the  more  or 
less  remote  past;  and  the  smaller  the  size  of  the  population  in  previous 
generations  the  less  remote  are  the  common  ancestors,  or  the  greater 
their  number.  Thus  pairs  mating  at  random  are  more  closely  related 
to  each  other  in  a  small  population  than  in  a  large  one.  This  is  why 
the  properties  of  small  populations  can  be  treated  as  the  consequences 
of  inbreeding. 
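
The rapidity with which the number of possible ancestors outruns the size of any real population is easily verified. The short Python sketch below (the function name is hypothetical) finds the first generation back at which the $2^t$ ancestral places of a single individual exceed a given population size.

```python
def generations_until_ancestors_exceed(population_size):
    """Smallest t for which 2**t is greater than population_size."""
    t = 0
    while 2 ** t <= population_size:
        t += 1
    return t


if __name__ == "__main__":
    for size in (1_000, 1_000_000):
        t = generations_until_ancestors_exceed(size)
        print(f"population of {size}: exceeded {t} generations back")
```

Even for a population of a million, the ancestral places of one individual outnumber the population within twenty generations, so that some ancestors must occupy more than one place in the pedigree.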

The  essential  consequence  of  two  individuals  having  a  common 
ancestor  is  that  they  may  both  carry  replicates  of  one  of  the  genes 
present  in  the  ancestor;  and  if  they  mate  they  may  pass  on  these 
replicates  to  their  offspring.  Thus  inbred  individuals — that  is  to 
say,  offspring  produced  by  inbreeding — may  carry  two  genes  at  a 
locus  that  are  replicates  of  one  and  the  same  gene  in  a  previous 
generation.  Consideration  of  this  consequence  of  inbreeding  shows 
that  there  are  two  sorts  of  identity  among  allelic  genes,  and  two  sorts 
of  homozygote.  The  sort  of  identity  we  have  hitherto  considered  is  a 
functional  identity.  Two  genes  are  regarded  as  being  identical  if  they 
are  not  recognisably  different  in  their  phenotypic  effects,  or  by  any 
other  functional  criterion;  in  other  words,  if  they  have  the  same 
allelomorphic state. Following the terminology of Crow (1954) they
may  be  called  alike  in  state.  An  individual  carrying  a  pair  of  such  genes 
is  a  homozygote  in  the  ordinary  sense.  The  new  sort  of  identity  is 
one  of  replication.  If  two  genes  originated  from  the  replication  of  one 
gene  in  a  previous  generation,  they  may  be  said  to  be  identical  by 
descent,  or  simply  identical.  An  individual  possessing  two  identical 
genes  at  a  locus  may  be  called  an  identical  homozygote.  Genes  that 
are not identical by descent may be called independent, whether they
are alike in state or different alleles; and homozygotes of independent
genes  may  be  called  independent  homozygotes. 

Identity by descent provides the basis for a measure of the dispersive
process, through the degree of relationship between the
mating pairs. The measure is the coefficient of inbreeding, which is the
probability that the two genes at any locus in an individual are identical
by descent. It refers to an individual and expresses the degree of
relationship between the individual's parents. If the parents mated
at random then the coefficient of inbreeding of the progeny is the
probability that two gametes taken at random from the parent
generation carry identical genes at a locus. The coefficient of inbreeding,
generally symbolised by F, was first defined by Wright
(1922)  as  the  correlation  between  uniting  gametes;  the  definition 
given  here,  which  follows  that  of  Malecot  (1948)  and  Crow  (1954),  is 
equivalent. 

The  degree  of  relationship  expressed  in  the  inbreeding  coefficient 
is  essentially  a  comparison  between  the  population  in  question  and 
some  specified  or  implied  base  population.  Without  this  point  of 
reference  it  is  meaningless,  as  the  following  consideration  will  show. 
On  account  of  the  limitation  in  the  number  of  independent  ancestors 
in  any  population  not  infinitely  large,  all  genes  now  present  at  a  locus 
in  the  population  would  be  found  to  be  identical  by  descent  if  traced 
far  enough  back  into  the  remote  past.  Therefore  the  inbreeding 
coefficient  only  becomes  meaningful  if  we  specify  some  time  in  the 
past  beyond  which  ancestries  will  not  be  pursued,  and  at  which  all 
genes  present  in  the  population  are  to  be  regarded  as  independent — 
that  is,  not  identical  by  descent.  This  point  is  the  base  population 
and  by  its  definition  it  has  an  inbreeding  coefficient  of  zero.  The 
inbreeding  coefficient  of  a  subsequent  generation  expresses  the 
amount  of  the  dispersive  process  that  has  taken  place  since  the  base 
population,  and  compares  the  degree  of  relationship  between  the 
individuals  now,  with  that  between  individuals  in  the  base  population. 
Reference  to  the  base  population  is  not  always  explicitly  stated,  but  is 
always implied. For example, we can speak of the inbreeding coefficient
of a population subdivided into lines. The comparison of
relationship  is  between  the  individuals  of  a  line  and  individuals 
taken  at  random  from  the  whole  population.  The  base  population 
implied  is  a  hypothetical  population  from  which  all  the  lines  were 
derived. 

Inbreeding  in  the  idealised  population.  Let  us  now  return  to 
the idealised population and deduce the coefficient of inbreeding in
successive generations, starting with the base population and its
progeny constituting generation 1. The situation may be visualised
by thinking of a hermaphrodite marine organism, capable of
self-fertilisation, shedding eggs and sperm into the sea. There are N
individuals each shedding equal numbers of gametes which unite at
random. All the genes at a locus in the base population have to be
regarded as being non-identical; so, considering only one locus,
among the gametes shed by the base population there are 2N different
sorts, in equal numbers, bearing the genes A1, A2, A3, etc. at the A
locus. The gametes of any one sort carry identical genes; those of
different sort carry genes of independent origin. What is the probability
that a pair of gametes taken at random carry identical genes?
This is the inbreeding coefficient of generation 1. Any gamete has a
(1/2N)th chance of uniting with another of the same sort, so 1/2N is the
probability that uniting gametes carry identical genes, and is thus the
coefficient of inbreeding of the progeny. Now consider the second
generation. There are now two ways in which identical homozygotes
can arise, one from the new replication of genes and the other
from the previous replication. The probability of newly replicated
genes coming together in a zygote is again 1/2N. The remaining
proportion, 1 - 1/2N, of zygotes carry genes that are independent in
their origin from generation 1, but may have been identical in their
origin from generation 0. The probability of their identical origin in
generation 0 is what we have already deduced as the inbreeding
coefficient of generation 1. Thus the total probability of identical
homozygotes in generation 2 is


$$F_2 = \frac{1}{2N} + \left(1 - \frac{1}{2N}\right)F_1$$

where F1 and F2 stand for the inbreeding coefficients of generations
1 and 2 respectively. The same argument applies to subsequent
generations, so that in general the inbreeding coefficient of individuals
in generation t is

$$F_t = \frac{1}{2N} + \left(1 - \frac{1}{2N}\right)F_{t-1} \tag{3.6}$$

Thus the inbreeding coefficient is made up of two parts: an "increment,"
1/2N, attributable to the new inbreeding, and a "remainder,"
attributable to the previous inbreeding and having the inbreeding
coefficient of the previous generation. In the idealised population the
"new inbreeding" arises from self-fertilisation, which brings together
genes replicated in the immediately preceding generation. Exclusion
of self-fertilisation simply shifts the replication one generation
further back, so that the "new inbreeding" brings together genes
replicated in the grand-parental generation; the coefficient of inbreeding
is affected, but not very much, as we shall see later. The
distinction between "new" and "old" inbreeding brings clearly to
light a point which we note here in passing because it will be needed
later and is often important in practice: if there is no "new inbreeding,"
as would happen if the population size were suddenly increased,
the previous inbreeding is not undone, but remains where it was
before the increase of population size.
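
The recurrence between successive generations, and the point that a sudden increase of population size does not undo previous inbreeding, can be followed numerically. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the increment in each generation is 1/2N for the population size of that generation, the base population having an inbreeding coefficient of zero.

```python
def inbreeding_history(sizes):
    """Inbreeding coefficient in successive generations.

    sizes : population size N in each generation after the base population
    Returns a list whose first entry is 0, the coefficient of the base
    population, followed by one entry per generation.
    """
    f = 0.0
    history = [f]
    for n in sizes:
        increment = 1.0 / (2 * n)             # the "new inbreeding"
        f = increment + (1.0 - increment) * f  # plus the "remainder"
        history.append(f)
    return history


if __name__ == "__main__":
    # Two generations at N = 10, then a very large population.
    for generation, f in enumerate(inbreeding_history([10, 10, 10 ** 9])):
        print(generation, round(f, 6))
```

In the third generation of this example the increment is negligible, but the coefficient stays at the level already reached and does not return toward zero.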

Let us call the "increment" or "new inbreeding" $\Delta F$, so that

$$\Delta F = \frac{1}{2N} \tag{3.7}$$

Equation 3.6 may then be rewritten in the form

$$F_t = \Delta F + (1 - \Delta F)F_{t-1} \tag{3.8}$$

Further rearrangement makes clearer the precise meaning of the
"increment," $\Delta F$.

$$\Delta F = \frac{F_t - F_{t-1}}{1 - F_{t-1}} \tag{3.9}$$


From the equation written thus we see that the "increment," $\Delta F$,
measures the rate of inbreeding in the form of a proportionate increase.
It is the increase of the inbreeding coefficient in one generation, relative
to the distance that was still to go to reach complete inbreeding.
This measure of the rate of inbreeding provides a convenient way of
going beyond the restrictive simplifications of the idealised population,
and it thus provides a means of comparing the inbreeding effects
of different breeding systems. When the inbreeding coefficient is
expressed in terms of $\Delta F$, equation 3.8 is valid for any breeding system
and is not restricted to the idealised population, though only in the
idealised population is $\Delta F$ equal to 1/2N.

So  far  we  have  done  no  more  than  relate  the  inbreeding  coefficient 
in  one  generation  to  that  of  the  previous  generation.  It  remains  to 
extend  equation  3.8  back  to  the  base  population  and  so  express  the 
inbreeding  coefficient  in  terms  of  the  number  of  generations.  This  is 
made easier by the use of a symbol, P, for the complement of the
inbreeding coefficient, 1 - F, which is known as the panmictic index.
Substitution of P = 1 - F in equation 3.8 gives

$$\frac{P_t}{P_{t-1}} = 1 - \Delta F \tag{3.10}$$

Thus the panmictic index is reduced by a constant proportion in
each generation. Extension back to generation t - 2 gives

$$\frac{P_t}{P_{t-2}} = (1 - \Delta F)^2$$

and extension back to the base population gives

$$P_t = (1 - \Delta F)^t P_0 \tag{3.11}$$

where P0 is the panmictic index of the base population. The base
population is defined as having an inbreeding coefficient of 0, and
therefore a panmictic index of 1. The inbreeding coefficient in any
generation, t, referred to the base population, is therefore

$$F_t = 1 - (1 - \Delta F)^t \tag{3.12}$$
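
Equation 3.12 lends itself to direct computation, and can be inverted to find how many generations are needed to reach a stated inbreeding coefficient. The following sketch is offered only as an illustration; the function names are hypothetical, and a constant rate of inbreeding is assumed throughout.

```python
import math


def inbreeding_coefficient(delta_f, t):
    """F after t generations at a constant rate delta_f (equation 3.12)."""
    return 1.0 - (1.0 - delta_f) ** t


def generations_to_reach(target_f, delta_f):
    """Smallest whole number of generations giving F >= target_f.

    Obtained by solving (1 - delta_f)**t = 1 - target_f for t and
    rounding up; target_f must be less than 1.
    """
    return math.ceil(math.log(1.0 - target_f) / math.log(1.0 - delta_f))


if __name__ == "__main__":
    delta_f = 1.0 / (2 * 10)        # idealised population with N = 10
    t = generations_to_reach(0.5, delta_f)
    print(t, inbreeding_coefficient(delta_f, t))
```

Because the panmictic index falls by a constant proportion, the number of generations required depends on the logarithm of 1 - F and rises steeply as the target approaches complete inbreeding.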

The  consequences  of  the  dispersive  process  were  described  earlier 
from  the  viewpoint  of  sampling  variance.  Let  us  now  look  again  at 
them,  applying  the  rate  of  inbreeding  and  the  inbreeding  coefficient 
as  measures  of  the  process.  Strictly  speaking  we  should  refer  still 
to  the  idealised  population,  but  the  equating  of  the  two  viewpoints 
can  be  regarded  as  generally  valid  except  in  some  very  special  and 
unlikely  circumstances  (see  Crow,  1954). 

Variance  of  gene  frequency.  First,  the  variance  of  the  change 
of  gene  frequency  in  one  generation,  taken  from  equation  3.1  and 
expressed  in  terms  of  the  rate  of  inbreeding,  becomes 

$$\sigma_{\Delta q}^2 = \frac{p_0q_0}{2N} = p_0q_0\,\Delta F \tag{3.13}$$

Similarly, the variance of gene frequencies among the lines at
generation t, taken from equation 3.2 and expressed in terms of the
inbreeding coefficient from 3.12, becomes

$$\sigma_q^2 = p_0q_0\left[1 - (1 - \Delta F)^t\right] = p_0q_0F \tag{3.14}$$

Thus $\Delta F$ expresses the rate of dispersion and F the cumulated effect
of random drift.
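
The agreement of the two viewpoints can be tested by direct calculation for a small idealised population. The sketch below is an illustration under stated assumptions: each generation is formed by drawing 2N genes at random (binomially) from the gene frequency of the previous generation, and the function names are hypothetical. It follows the exact distribution of gene frequency among lines and compares its variance with equation 3.14.

```python
from math import comb


def drift_distribution(two_n, start_count, generations):
    """Exact distribution of the number of copies of an allele among lines.

    two_n       : number of genes (2N) sampled to form each generation
    start_count : number of copies in the base population
    Returns a list in which entry j is the probability of j copies.
    """
    dist = [0.0] * (two_n + 1)
    dist[start_count] = 1.0
    for _ in range(generations):
        new = [0.0] * (two_n + 1)
        for i, prob in enumerate(dist):
            if prob == 0.0:
                continue
            q = i / two_n
            for j in range(two_n + 1):
                # Binomial probability of drawing j copies out of 2N.
                new[j] += prob * comb(two_n, j) * q ** j * (1 - q) ** (two_n - j)
        dist = new
    return dist


def variance_of_frequency(dist):
    """Variance of gene frequency among lines for a given distribution."""
    two_n = len(dist) - 1
    mean = sum(p * j / two_n for j, p in enumerate(dist))
    return sum(p * (j / two_n - mean) ** 2 for j, p in enumerate(dist))


if __name__ == "__main__":
    n, q0, t = 5, 0.5, 6
    dist = drift_distribution(2 * n, int(q0 * 2 * n), t)
    f = 1.0 - (1.0 - 1.0 / (2 * n)) ** t
    print("variance among lines:", variance_of_frequency(dist))
    print("p0 q0 F             :", (1 - q0) * q0 * f)
```

The two printed values should agree to within rounding error, since under binomial sampling the expected value of pq is reduced by the proportion 1/2N in every generation.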

Genotype  frequencies.  Leaving  fixation  aside  for  the  moment, 
let  us  consider  next  the  genotype  frequencies  in  the  population  as  a 
whole.  The  genotype  frequencies  expressed  in  terms  of  the  variance 
of  gene  frequency  in  equations  3.5  can  be  rewritten  in  terms  of  the 
coefficient of inbreeding from equation 3.14. The frequency of A2A2,
for example, is

$$\overline{q^2} = q_0^2 + \sigma_q^2 = q_0^2 + p_0q_0F$$

The genotype frequencies expressed in this way are entered in the
left-hand side of Table 3.1. As was explained before, this way of
writing the genotype frequencies shows how the homozygotes increase
at the expense of the heterozygotes.

Table 3.1

Genotype frequencies for a locus with two alleles, expressed
in terms of the inbreeding coefficient, F.

             Original      Change due              Origin:
Genotype     frequencies   to inbreeding           Independent          Identical

A1A1         $p_0^2$       $+\ p_0q_0F$      or    $p_0^2(1 - F)$       $+\ p_0F$
A1A2         $2p_0q_0$     $-\ 2p_0q_0F$     or    $2p_0q_0(1 - F)$
A2A2         $q_0^2$       $+\ p_0q_0F$      or    $q_0^2(1 - F)$       $+\ q_0F$

Recognition of identity by descent, to which the inbreeding viewpoint
led us, means that we can now distinguish the two sorts of homozygote,
identical and independent, among both the A1A1 and A2A2 genotypes.
The frequency of identical homozygotes among both genotypes together is
by definition the inbreeding coefficient, F; and it is clear that the
division between the two genotypes is in proportion to the initial
gene frequencies. So p0F is the frequency of A1A1 identical homozygotes,
and q0F that of A2A2 identical homozygotes. The remaining
genotypes, both homozygotes and heterozygotes, carry genes that are
independent in origin and are therefore the equivalent of pairs of
gametes taken at random from the population as a whole. Their
frequencies are therefore the Hardy-Weinberg frequencies. Thus,
from the inbreeding viewpoint, we arrive at the genotype frequencies
shown in the right-hand columns of Table 3.1. This way of writing
the genotype frequencies shows how homozygotes are divided between
those of independent and those of identical origin. The
equivalence  of  the  two  ways  of  expressing  the  genotype  frequencies 
can  be  verified  from  their  algebraic  identity.  Both  ways  show  equally 
clearly  how  the  heterozygotes  are  reduced  in  frequency  in  proportion 
to 1 - F. The term "heterozygosity" is often used to express the
frequency of heterozygotes at any time, relative to their frequency in
the base population. The heterozygosity is the same as the panmictic
index, P. Thus if Ht and H0 are the frequencies of heterozygotes
for a pair of alleles at generation t and in the base population
respectively, then the heterozygosity at generation t is

$$\frac{H_t}{H_0} = P_t \tag{3.15}$$
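
The algebraic identity of the two ways of writing the genotype frequencies in Table 3.1 may be confirmed numerically. The sketch below is illustrative; the function name is hypothetical and the values of q0 and F used in the example are invented.

```python
def genotype_frequencies_inbred(q0, f):
    """Frequencies of A1A1, A1A2, A2A2 written in both forms of Table 3.1."""
    p0 = 1.0 - q0
    # Left-hand form: original frequency plus change due to inbreeding.
    by_change = (
        p0 ** 2 + p0 * q0 * f,
        2 * p0 * q0 - 2 * p0 * q0 * f,
        q0 ** 2 + p0 * q0 * f,
    )
    # Right-hand form: independent origin plus identical origin.
    by_origin = (
        p0 ** 2 * (1 - f) + p0 * f,
        2 * p0 * q0 * (1 - f),
        q0 ** 2 * (1 - f) + q0 * f,
    )
    return by_change, by_origin


if __name__ == "__main__":
    by_change, by_origin = genotype_frequencies_inbred(q0=0.3, f=0.25)
    for name, a, b in zip(("A1A1", "A1A2", "A2A2"), by_change, by_origin):
        print(name, round(a, 6), round(b, 6))
```

The heterozygote entry divided by its original value 2p0q0 gives 1 - F, which is the heterozygosity of equation 3.15.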

Fixation. There is little to add, from the inbreeding viewpoint, to
the description of fixation given earlier. The rate of fixation (that
is, the proportion of unfixed loci that become fixed in any generation)
is equal to $\Delta F$, after the steady phase has been reached and the
distribution of gene frequencies has become flat. The quantity P in
equations 3.3, which give the probability of a gene having become
fixed or lost, is equal to 1 - F. We may note, however, that the
probability of fixation is not very different from the inbreeding
coefficient itself. The explanation comes more readily by considering
the probability that a locus remains unfixed. This probability was
given in equation 3.3 for a locus with two alleles after enough generations
have passed to take the population into the steady phase.
Expressed in terms of the inbreeding coefficient, from equation 3.12,
it is 6p0q0(1 - F). Now, the value of p0q0 does not change very much
over quite a wide range of gene frequencies, and so the probability
that a locus is still unfixed is not very sensitive to the initial gene
frequency. The value of 6p0q0 lies between 1.0 and 1.5 over a range
of gene frequency from 0.2 to 0.8, a range that is likely to cover many
situations. Consequently the probability that a line still segregates,
or the proportion of loci expected to remain unfixed, is likely to lie
between (1 - F) and 1.5(1 - F). Thus the inbreeding coefficient gives
a good idea of the approximate probability of fixation, even in the
absence of a knowledge of the initial gene frequencies. That the
approximation may be quite close enough for practical purposes may
be seen by taking a specific example. In work involving immunological
reactions it may be necessary to produce a strain in which all
loci that determine the reactions have been fixed. One therefore
wants to know the inbreeding coefficient necessary to raise the
probability of fixation, or the proportion of loci expected to be fixed,
to a certain level, say 90 per cent. The inbreeding coefficient needed
to do this would, on the above considerations, lie between 0.90 and
0.93, and this would answer the question with quite enough accuracy
for most purposes.
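
The limits quoted in this example can be reproduced by setting the probability of remaining unfixed, 6p0q0(1 - F), equal to the proportion of loci allowed to remain unfixed and solving for F. The sketch below is illustrative only; the function name is hypothetical, and the steady phase is assumed to have been reached.

```python
def inbreeding_needed(fixed_target, q0):
    """F at which the expected proportion of fixed loci equals fixed_target.

    Solves 6 * p0 * q0 * (1 - F) = 1 - fixed_target for F, using the
    steady-phase expression for the probability of remaining unfixed.
    """
    p0 = 1.0 - q0
    return 1.0 - (1.0 - fixed_target) / (6 * p0 * q0)


if __name__ == "__main__":
    for q0 in (0.2, 0.5, 0.8):
        print(q0, round(inbreeding_needed(0.90, q0), 3))
```

Over initial gene frequencies from 0.2 to 0.8 the required coefficient varies only between about 0.90 and 0.93, in agreement with the limits given above.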


CHAPTER  4 

SMALL  POPULATIONS: 

II.  Less  Simplified  Conditions 

In  order  to  simplify  the  description  of  the  dispersive  process  we 
confined  our  attention  in  the  last  chapter  to  an  idealised  population, 
and  to  do  this  we  had  to  specify  a  number  of  restrictive  conditions, 
which  could  seldom  be  fulfilled  in  real  populations.  The  purpose  of 
this  chapter  is  to  adapt  the  conclusions  of  the  last  chapter  to  situations 
in  which  the  conditions  imposed  do  not  hold;  in  other  words  to 
remove  the  more  serious  restrictions  and  bring  the  conclusions  closer 
to  reality.  The  restrictive  conditions  were  of  two  sorts,  one  sort 
being  concerned  with  the  breeding  structure  of  the  population  and 
the other excluding mutation, migration, and selection from consideration.
We shall first describe the effects of deviations from the
idealised breeding structure, and then consider the outcome of the
dispersive process when mutation, migration, or selection are operating
at the same time.

Effective  Population  Size 

If  the  breeding  structure  does  not  conform  to  that  specified  for 
the  idealised  population,  it  is  still  possible  to  evaluate  the  dispersive 
process  in  terms  of  either  the  variance  of  gene  frequencies  or  the  rate 
of  inbreeding.  This  can  be  done  by  the  same  general  methods  and 
no new principles are involved. We shall therefore give the conclusions
briefly and without detailed explanation. The most convenient
way of dealing with any particular deviation from the
idealised breeding structure is to express the situation in terms of
the effective number of breeding individuals, or the effective population
size. This is the number of individuals that would give rise to the
sampling variance or the rate of inbreeding appropriate to the conditions
under consideration, if they bred in the manner of the
idealised population. Thus, by converting the actual number, N, to
the effective number, Ne, we can apply the formulae deduced in the
last chapter. The rate of inbreeding, for example, is

$$\Delta F = \frac{1}{2N_e} \tag{4.1}$$

just as for the idealised population $\Delta F = 1/2N$ (equation 3.7).

The  relationships  between  actual  and  effective  numbers  in  the 
situations  most  commonly  met  with  are  given  below.  The  exact 
expressions  are  often  complicated,  but  in  most  circumstances  an 
approximation  can  be  used  with  sufficient  accuracy.  We  should  first 
note that the actual number, N, refers to breeding individuals (the
breeding individuals of one generation) and it therefore cannot be
obtained directly from a census, unless the different age-groups are
distinguished.

Bisexual organisms: self-fertilisation excluded. The exclusion
of self-fertilisation makes very little difference to the rate of
inbreeding, unless N is very small, as with close inbreeding. The
relationship of effective to actual numbers (Wright, 1931) is

$$N_e = N + \tfrac{1}{2} \quad \text{(approx.)} \tag{4.2}$$

and the rate of inbreeding is

$$\Delta F = \frac{1}{2N + 1} \quad \text{(approx.)} \tag{4.3}$$


The  exact  expression  for  the  inbreeding  coefficient  in  a  bisexual 
population,  and  its  derivation,  are  given  by  Malecot  (1948). 

Different  numbers  of  males  and  females.  In  domestic  and 
laboratory  animals  the  sexes  are  often  unequally  represented  among 
the  breeding  individuals,  since  it  is  more  economical,  when  possible, 
to  use  fewer  males  than  females.  The  two  sexes,  however,  whatever 
their  relative  numbers,  contribute  equally  to  the  genes  in  the  next 
generation.  Therefore  the  sampling  variance  attributable  to  the  two 
sexes  must  be  reckoned  separately.  Since  the  sampling  variance  is 
proportional  to  the  reciprocal  of  the  number,  the  effective  number  is 
twice the harmonic mean of the numbers of the two sexes (Wright,
1931), so that

$$\frac{1}{N_e} = \frac{1}{4N_m} + \frac{1}{4N_f} \qquad (4.4)$$

where $N_m$ and $N_f$ are the actual numbers of males and females
respectively. The rate of inbreeding is then

$$\Delta F = \frac{1}{8N_m} + \frac{1}{8N_f} \qquad \text{(approx.)} \qquad (4.5)$$

This  gives  a  close  enough  approximation  unless  both  Nm  and  Nf  are 
very  small,  as  with  close  inbreeding.  It  should  be  noted  that  the  rate 
of  inbreeding  depends  chiefly  on  the  numbers  of  the  less  numerous 
sex.  For  example,  if  a  population  were  maintained  with  an  in- 
definitely large  number  of  females  but  only  one  male  in  each  genera- 
tion, the  effective  number  would  be  only  about  4. 
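
The dependence of the effective number on the less numerous sex can be checked numerically. The following is a minimal sketch, assuming the relation $1/N_e = 1/(4N_m) + 1/(4N_f)$ and the approximate rate of inbreeding $1/(2N_e)$; the function names are illustrative and do not come from the text.

```python
def effective_number_sexes(n_males: float, n_females: float) -> float:
    """Effective number from 1/Ne = 1/(4*Nm) + 1/(4*Nf)."""
    return 1.0 / (1.0 / (4.0 * n_males) + 1.0 / (4.0 * n_females))


def rate_of_inbreeding(effective_number: float) -> float:
    """Approximate rate of inbreeding per generation, 1/(2*Ne)."""
    return 1.0 / (2.0 * effective_number)


if __name__ == "__main__":
    # One male with a very large number of females: Ne approaches 4.
    print(effective_number_sexes(1, 1_000_000))
    # Equal numbers of the two sexes: Ne equals the total, Nm + Nf.
    print(effective_number_sexes(50, 50))
```

Increasing the number of females beyond a few times the number of males changes the result very little, which is the point made in the paragraph above.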

Unequal numbers in successive generations. The rate of
inbreeding in any one generation is given, as before, by $1/2N$. If the
numbers are not constant from generation to generation, then the
mean rate of inbreeding is the mean value of $1/2N$ in successive
generations. The effective number is the harmonic mean of the numbers in
each generation (Wright, 1939). Over a period of $t$ generations,
therefore,

$$\frac{1}{N_e} = \frac{1}{t}\left[\frac{1}{N_1} + \frac{1}{N_2} + \frac{1}{N_3} + \cdots + \frac{1}{N_t}\right] \qquad \text{(approx.)} \qquad (4.6)$$

Thus  the  generations  with  the  smallest  numbers  have  the  most  effect. 
The  reason  for  this  can  be  seen  by  consideration  of  the  "new"  and 
"old"  inbreeding  referred  to  in  connexion  with  equation  3.6.  An 
expansion  in  numbers  does  not  affect  the  previous  inbreeding;  it 
merely  reduces  the  amount  of  new  inbreeding.  So,  in  a  population 
with  fluctuating  numbers  the  inbreeding  proceeds  by  steps  of  varying 
amount,  and  the  present  size  of  the  population  indicates  only  the 
present  rate  of  inbreeding. 
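
The effect of a single small generation can be seen from a short calculation. This is a sketch only, assuming the effective number is taken as the harmonic mean of the breeding numbers in successive generations; the function name and the sequence of numbers are hypothetical.

```python
def effective_number_over_generations(numbers):
    """Harmonic mean of the breeding numbers in successive generations."""
    if not numbers:
        raise ValueError("at least one generation is required")
    return len(numbers) / sum(1.0 / n for n in numbers)


if __name__ == "__main__":
    # Hypothetical population passing through one generation of only 10.
    sizes = [100, 100, 10, 100]
    print(effective_number_over_generations(sizes))  # harmonic mean
    print(sum(sizes) / len(sizes))                   # arithmetic mean, for contrast
```

The harmonic mean lies far below the arithmetic mean because the reciprocal of the smallest number dominates the sum.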

Non-random  distribution  of  family  size.  This  is  probably  the 
commonest  and  most  important  deviation  from  the  breeding  system 
of  the  idealised  population.  Its  consequence  is  usually  to  render  the 
effective  number  less  than  the  actual,  but  in  special  circumstances  it 
makes  it  greater.  Family  size  means  here  the  number  of  progeny  of 
an  individual  parent  or  of  a  pair  of  parents,  that  survive  to  become 
breeding  individuals.  It  will  be  remembered  that  each  breeding 
individual  in  the  idealised  population  contributes  equally  to  the  pool 
of  gametes,  and  therefore  equally  also  to  the  potential  zygotes  in  the 
next  generation.  Survival  of  zygotes  is  random.  The  mean  number 
of  progeny  surviving  to  breeding  age  is  1  for  individual  parents  and  2 
for pairs of parents. Since the chance of survival for any particular
zygote  is  small,  the  variation  of  family  size  follows  a  Poisson  distribu- 
tion. The  variance  of  family  size  is  therefore  equal  to  the  mean  family 
size,  equality  of  mean  and  variance  being  a  property  of  the  Poisson 
distribution.  Thus  in  a  population  of  bisexual  organisms,  in  which 
all  other  conditions  of  the  idealised  population  are  satisfied,  family 
size  will  have  a  mean  and  a  variance  of  2.  In  natural  populations  the 
mean  is  not  likely  to  differ  much  from  2,  but  the  variance  must  be 
expected  to  be  usually  greater,  for  reasons  of  differing  fertility  be- 
tween the  parent  individuals  and  differing  viability  between  the 
families.  If  the  variance  of  family  size  is  increased,  a  greater  propor- 
tion of  the  following  generation  will  be  the  progeny  of  a  smaller 
number  of  parents,  and  the  effective  number  of  parents  will  be  less 
than  the  actual  number.  Conversely,  if  the  variance  of  family  size  is 
reduced  below  that  of  the  idealised  population,  the  effective  number 
will  be  greater  than  the  actual  number.  It  can  be  shown  that,  when 
the mean family size is 2, the effective number is as follows (Wright,
1940; Crow, 1954):

$$N_e = \frac{4N}{2 + \sigma_k^2} \qquad (4.7)$$

where $\sigma_k^2$ is the variance of family size. (Strictly speaking this is the
effective number as it affects variance of gene frequency and fixation:
for its effects on the inbreeding coefficient, $N_e = (4N - 2)/(2 + \sigma_k^2)$. The
difference is small and we shall ignore it.) Thus, when there is equal
fertility of the parents and random survival of the progeny, $\sigma_k^2 = 2$, and
$N_e = N$. When differences of fertility and viability make $\sigma_k^2$ greater
than 2, as in most actual populations, then $N_e$ is less than $N$. The
effective  number  under  consideration  here  refers  to  a  population  with 
equal  numbers  of  males  and  females,  and  with  monogamous  mating. 
If  males  are  not  restricted  to  a  single  mate,  then  the  families  of  males 
are  likely  to  be  more  variable  in  size  than  those  of  females.  In  these 
circumstances  the  relationship  of  effective  to  actual  numbers  will 
differ  for  male  and  female  parents. 
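
The way in which the variance of family size alters the effective number may be illustrated as follows. The sketch assumes the relation $N_e = 4N/(2 + \sigma_k^2)$ for a mean family size of 2, and uses the population variance of the listed family sizes; the family sizes themselves are invented for illustration.

```python
from statistics import pvariance


def effective_number_family_size(n_actual: float, variance_family_size: float) -> float:
    """Ne = 4N / (2 + variance of family size), valid when mean family size is 2."""
    return 4.0 * n_actual / (2.0 + variance_family_size)


if __name__ == "__main__":
    n_actual = 10  # five pairs of parents (hypothetical)
    # Progeny per pair surviving to breed; each list has a mean of 2.
    for families in ([0, 1, 2, 3, 4], [0, 0, 2, 4, 4], [2, 2, 2, 2, 2]):
        variance = pvariance(families)
        print(families, variance, effective_number_family_size(n_actual, variance))
```

A variance of 2 returns the actual number, a larger variance gives a smaller effective number, and a variance of zero gives twice the actual number.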

It is possible by controlled breeding to make the variance of
family size, $\sigma_k^2$, less than 2, and therefore to make the effective
number greater than the actual. If two members of each family are
deliberately chosen to be parents of the next generation, then the
variance of family size is zero. Under these special circumstances,
and if the sexes are equal in numbers, the effective number is twice
the actual:

$$N_e = 2N \qquad (4.8)$$

The  rate  of  inbreeding  is  consequently  half  what  it  would  be  in  an 
idealised  population  of  equal  size,  and  is  usually  less  than  half  the 
rate  of  inbreeding  under  normal  circumstances  and  random  mating. 
Under  this  controlled  breeding  system  the  rate  of  inbreeding  is  the 
lowest  possible  with  a  given  number  of  breeding  individuals.  The 
reduced variance of family size is the path through which the "deliberate
avoidance of inbreeding" works. The problem often arises
of  keeping  a  stock  with  minimum  inbreeding,  but  with  a  limitation  of 
the  actual  population  size  imposed  by  the  space  or  facilities  available. 
A  common  practice  under  these  circumstances  is  the  deliberate 
avoidance  of  sib-matings  and  perhaps  also  of  cousin-matings.  One 
may  go  further  and  by  the  use  of  pedigrees  (in  the  manner  described 
in  the  next  chapter)  choose  pairs  for  mating  that  have  the  least 
possible  relationship  with  each  other.  Deliberate  avoidance  of  in- 
breeding in  this  way  has  the  effect  of  distributing  the  individuals 
chosen  to  be  parents  evenly  over  the  available  families,  and  thus 
reduces  the  variance  of  family  size  and  the  rate  of  inbreeding.  The 
same  result,  however,  can  be  achieved  with  less  labour  simply  by 
ensuring  that  the  available  families  are  as  far  as  possible  equally 
represented  among  the  individuals  chosen  to  be  the  parents  of  the 
next  generation.  If,  in  addition,  matings  between  close  relatives  are 
avoided,  the  inbreeding  coefficient  in  any  generation  is  slightly  lower 
and  is  more  uniform  between  the  individuals  in  the  generation  than  if 
matings  between  close  relatives  are  allowed;  but  the  rate  of  inbreeding 
is  the  same. 

If  the  sexes  are  unequal  in  numbers,  but  the  individuals  chosen  as 
parents  are  equally  distributed,  in  numbers  and  sexes,  between  the 
families,  so  that  the  variance  of  family  size  is  still  zero,  then  the  rate 
of inbreeding is given by the following formula (Gowe, Robertson,
and Latter, 1959):

$$\Delta F = \frac{3}{32N_m} + \frac{1}{32N_f} \qquad (4.9)$$

where $N_m$ and $N_f$ are the actual numbers of male and female parents
respectively, and females are more numerous than males.

Example 4.1. Several flocks of poultry in the United States and in
Canada, which are used as controls for breeding experiments, are
maintained by the following breeding system (Gowe, Robertson, and Latter,
1959).

There  are  50  breeding  males  and  250  breeding  females  in  each  genera- 
tion. Every  male  is  the  son  of  a  different  father,  and  every  female  the 
daughter  of  a  different  mother,  so  that  the  variance  of  family  size  is  zero. 
One  of  the  objectives  of  this  breeding  system  is  to  minimise  the  rate  of 
inbreeding.  Let  us  therefore  find  what  the  rate  of  inbreeding  is,  and  then 
see  how  much  is  achieved  in  this  respect  by  the  deliberate  equalisation  of 
family size. By equation 4.9 the rate of inbreeding in these flocks is
$\Delta F = 0.002$. If there were no deliberate choice of breeding individuals, and
family size conformed to a Poisson distribution, the rate of inbreeding by
equation 4.5 would be $\Delta F = 0.003$. Thus, without the deliberate
equalisation of family size the rate of inbreeding would be 50 per cent greater. If
a low rate of inbreeding were the only objective, the number of females
could be substantially reduced without much effect. For example, if
there were no more females than males, with 50 of each sex ($N = 100$) and
with equalisation of family size, the rate of inbreeding from equation 4.8
would be $\Delta F = 0.0025$, which is not very much greater than with five times
as  many  females.  This  illustrates  the  point,  mentioned  earlier,  that  most 
of  the  inbreeding  comes  from  the  less  numerous  sex. 
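
The arithmetic of this example can be retraced in a few lines. The sketch below assumes that the rate of inbreeding with equalised family size is $3/(32N_m) + 1/(32N_f)$ and that the rate with Poisson family size is $1/(8N_m) + 1/(8N_f)$; these forms reproduce the figures quoted in the example, and the function names are illustrative only.

```python
def delta_f_equalised(n_males: float, n_females: float) -> float:
    """Rate of inbreeding with family sizes equalised (assumed form)."""
    return 3.0 / (32.0 * n_males) + 1.0 / (32.0 * n_females)


def delta_f_random(n_males: float, n_females: float) -> float:
    """Rate of inbreeding with Poisson-distributed family sizes (assumed form)."""
    return 1.0 / (8.0 * n_males) + 1.0 / (8.0 * n_females)


if __name__ == "__main__":
    print(delta_f_equalised(50, 250))  # control flocks as described
    print(delta_f_random(50, 250))     # same numbers, no equalisation
    print(delta_f_equalised(50, 50))   # females reduced to equal the males
```

With equal numbers of the two sexes the first function reduces to $1/(4N)$, which agrees with an effective number of twice the actual.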


Ratio  of  effective  to  actual  number.  When  matings  are  con- 
trolled and  pedigree  records  kept,  the  rate  of  inbreeding  can  readily 
be  computed,  as  will  be  explained  in  the  next  chapter.  But  pedigree 
records  are  not  available  for  natural  populations,  nor  for  laboratory 
populations kept by mass culture, as for example Drosophila
populations. How are we to estimate the rate of inbreeding in such
populations? We know the effective number is likely to be less than the
actual, but how much less? To estimate the effective number requires
a  special  experiment,  and  only  the  actual  number  is  likely  to  be  known. 
Determinations  of  the  ratio  of  effective  to  actual  numbers,  Ne/N, 
from  data  on  man,  Drosophila,  and  the  snail  Lymnaea,  led  to  values 
ranging  from  70  per  cent  to  95  per  cent  (Crow  and  Morton,  1955). 
In  the  absence  of  specific  knowledge,  therefore,  it  would  seem 
reasonable  to  take  the  effective  number  as  being,  very  roughly,  about 
three-quarters  of  the  actual  number.  There  are  two  methods  by 
which the ratio $N_e/N$ may be determined: (1) by the estimation of the
variance  of  family  size,  which  yields  Ne  by  equation  4.7  (though 
adjustment  has  to  be  made  if  the  mean  family  size  at  the  time  of 
measurement  is  not  2);  and  (2)  by  the  estimation  of  the  variance  of  the 
changes of gene frequency during inbreeding, which yields Ne by
equation  3.1.  Both  methods  have  been  applied  to  Drosophila 
melanogaster  in  laboratory  cultures.  The  ratio  Ne/N  for  female 
parents  was  71  per  cent  by  the  first  method  and  76  per  cent  by  the 
second;  and  for  male  parents,  48  per  cent  and  35  per  cent  (Crow  and 
Morton, 1955). The ratio $N_e/N$ for the sexes jointly, determined by
the  second  method,  ranged  from  56  per  cent  to  83  per  cent,  with  a 
mean  of  70  per  cent,  in  five  experiments  with  equal  actual  numbers 
of males and females (Kerr and Wright, 1954a, b; Wright and Kerr,
1954;  Buri,  1956).  The  low  value  of  56  per  cent  was  found  in  rather 
poor  culture  conditions  of  crowding,  where  there  was  more  compe- 
tition (Buri,  1956). 

Example 4.2. As an illustration of the use of the ratio $N_e/N$ let us find
the expected rate of inbreeding in a population of Drosophila maintained
by 20 pairs of parents in each generation. The actual number is $N = 40$.
If the effective number were equal to the actual, the rate of inbreeding, by
equation 4.1, would be $\Delta F = 1/80 = 1.25$ per cent. If we take $N_e = 0.7N$, from
the experimental results cited above, then $N_e = 28$, and the rate of
inbreeding is $\Delta F = 1/56 = 1.786$ per cent. The coefficient of inbreeding after
10, 50, and 100 generations would then be (by equation 3.12) 17 per cent,
59 per cent, and 84 per cent.
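
The steps of this example may be sketched in code. Equation 3.12 is not reproduced in this chapter; the sketch assumes it has the usual form $F_t = 1 - (1 - \Delta F)^t$, and the function names are illustrative. Under that assumption the computed coefficients come out close to the rounded percentages quoted above.

```python
def rate_of_inbreeding(effective_number: float) -> float:
    """Rate of inbreeding per generation, 1/(2*Ne)."""
    return 1.0 / (2.0 * effective_number)


def inbreeding_coefficient(delta_f: float, generations: int) -> float:
    """Assumed form of equation 3.12: F_t = 1 - (1 - delta_F)**t."""
    return 1.0 - (1.0 - delta_f) ** generations


if __name__ == "__main__":
    n_actual = 40
    n_effective = 0.7 * n_actual  # ratio Ne/N taken as 0.7
    delta_f = rate_of_inbreeding(n_effective)
    print(delta_f)
    for t in (10, 50, 100):
        print(t, inbreeding_coefficient(delta_f, t))
```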


Migration,  Mutation,  and  Selection 

The  description  of  the  dispersive  process  given  so  far  in  this 
chapter  and  the  previous  one  is  conditional  on  the  systematic  pro- 
cess of  mutation,  migration,  and  selection  being  absent,  and  its  rele- 
vance to  real  populations  is  therefore  limited.  So  let  us  now  consider 
the  effects  of  the  dispersive  and  systematic  processes  when  acting 
jointly.  The  systematic  processes,  as  we  have  seen  in  Chapter  2, 
tend  to  bring  the  gene  frequencies  to  stable  equilibria  at  particular 
values  which  would  be  the  same  for  all  populations  under  the  same 
conditions.  The  dispersive  process,  in  contrast,  tends  to  scatter  the 
gene  frequencies  away  from  these  equilibrium  values,  and  if  not  held 
in  check  by  the  systematic  processes  it  would  in  the  end  lead  to  all 
genes  being  either  fixed  or  lost  in  all  populations  not  infinite  in  size. 
The  tendency  of  the  systematic  processes  to  change  the  gene  fre- 
quency toward  its  equilibrium  value  becomes  stronger  as  the  fre- 
quency deviates  further  from  this  value.  For  this  reason  the  opposing 
tendencies of the dispersive and systematic processes reach a point of
balance:  a  point  at  which  the  dispersion  of  the  gene  frequencies  is 
held  in  check  by  the  systematic  processes.  When  this  point  of  balance 
is  reached  there  will  be  a  certain  degree  of  differentiation  between 
sub-populations,  but  it  will  neither  increase  nor  decrease  so  long  as 
the  conditions  remain  unchanged.  The  problem  is  therefore  to  find 
the  distribution  of  gene  frequencies  among  the  lines  of  a  subdivided 
population  when  this  steady  state  has  been  reached.  The  solution  is 
complicated  mathematically,  and  we  shall  give  only  the  main  con- 
clusions, explaining  their  meaning  but  not  their  derivation.  For 
details  of  the  joint  action  of  the  dispersive  and  systematic  processes, 
see Wright (1931, 1942, 1948, 1951).

Mutation  and  migration.  Mutation  and  migration  can  be 
dealt  with  together  because  they  change  the  gene  frequency  in  the 
same  manner.  Consider  again  a  population  subdivided  into  many 
lines, all with an effective size $N_e$; and let a proportion, $m$, of the
breeding  individuals  of  every  generation  in  each  line  be  immigrants 
coming  at  random  from  all  other  lines.  Consider  two  alleles  at  a 
locus,  with  mean  frequencies  p  and  q  in  the  population  as  a  whole,  and 
with  mutation  rates  u  and  v  in  the  two  directions.  Then,  when  the 
balance  between  dispersion  on  the  one  hand  and  mutation  and 
migration  on  the  other  is  reached,  the  variance  of  the  gene  frequency 
among  the  lines  is  given  by  the  following  expression  (Wright,  1931; 
Malecot,  1948): 


$$\sigma_q^2 = \frac{pq}{1 + 4N_e(u + v + m)} \qquad \text{(approx.)} \qquad (4.10)$$


The  degree  of  dispersion  represented  here  by  the  variance  of  the  gene 
frequency  can  also  be  expressed  as  a  coefficient  of  inbreeding,  by 
putting $\sigma_q^2 = Fpq$, from equation 3.14. Then

$$F = \frac{1}{1 + 4N_e(u + v + m)} \qquad \text{(approx.)} \qquad (4.11)$$

The  theoretical  distributions  of  the  gene  frequency  appropriate  to 
four different values of $F$, when the mean gene frequency is 0.5, are
shown in Fig. 4.1. These distributions show how high $F$ must be for
there  to  be  a  substantial  amount  of  fixation  or  of  differentiation  be- 
tween sub-populations.  What  the  distributions  depict  can  be  stated 
in  three  ways:  (a)  If  we  had  a  large  number  of  sub-populations  and  we 
determined the frequency of a particular gene in all of them, the
distribution curve is what we should obtain by plotting the percentage
of  sub-populations  showing  each  gene  frequency.  Or,  in  other  words, 
the  height  of  the  curve  at  a  particular  gene  frequency  shows  the 
probability  of  finding  that  gene  frequency  in  any  one  sub-population. 
(b) If we had one sub-population and measured the gene frequencies
at a large number of loci, all of which started with the same initial
frequency, the curve is the distribution of frequencies that we should
find. (c) If we had one sub-population and measured the frequency of
one particular gene repeatedly over a long period of time, the curve is
the distribution of frequencies that we should find. The distributions
describe the state of affairs when equilibrium between the systematic
and dispersive processes has been reached, and the population as a
whole is in a steady state. From the distributions shown in Fig. 4.1 it
will be seen that when $F$ is 0.005 there is very little differentiation,
and when $F$ is 0.048 there is a fair amount of differentiation but still
no fixation. When $F$ is 0.333 the distribution is flat, which means that
all gene frequencies are equally probable (including 0 and 1); thus
there is much differentiation, and in addition a substantial amount
of fixation and loss occurs.
When  F  exceeds  this  critical  value  intermediate  gene  frequencies 
become  rarer,  and  a  greater  proportion  of  sub-populations  have  the 
gene  either  fixed  or  lost.  When  mutation  or  migration  occurs,  fixation 
or  loss  is  not  a  permanent  state  in  any  one  sub-population;  the  amount 
of  fixation  or  loss  is  what  would  be  found  at  any  one  time. 

Fig. 4.1. Theoretical distributions of gene frequency among
sub-populations, when dispersion is balanced by mutation or migration.
The states of dispersion to which the curves refer are indicated by the
values of $F$ in the figure. (Redrawn from Wright, 1951.)

Let us return now to the expression, 4.11, relating the coefficient
of inbreeding to the rates of mutation and migration when the
population has reached the steady state; and let us consider the rates
of mutation or migration, in relation to the effective population size,
that would just allow the dispersive process to go to the critical point
corresponding to the value of $F = 0.333$. Putting this value of $F$ in
equation 4.11 yields

$$u + v + m = \frac{1}{2N_e} \qquad \text{(approx.)} \qquad (4.12)$$

First  let  us  consider  mutation  alone.  If  the  sum  of  the  mutation 
rates in the two directions ($u + v$) were $10^{-5}$, which is a realistic value
to  take  according  to  what  is  known  of  mutation  rates,  then  the  critical 
state  of  dispersion  will  be  reached  in  sub-populations  of  effective  size 
Ne  =  50,000.  In  other  words,  mutation  rates  of  this  order  of  magni- 
tude will  arrest  the  dispersive  process  before  the  critical  state  only  in 
populations  with  effective  numbers  greater  than  50,000.  Populations 
smaller  than  this  will  show  a  substantial  amount  of  fixation  of  genes 
having  this  mutation  rate.  In  practice,  therefore,  mutation  may  be 
discounted  as  a  force  opposing  dispersion  in  populations  that  would 
commonly  be  regarded  as  "small";  populations,  that  is,  with  effective 
numbers  of  the  order  of  100,  or  even  1,000. 

With  migration  the  picture  is  different,  because  what  would  be 
considered  a  high  rate  of  mutation  would  be  judged  a  low  rate  of 
migration. The critical value of $F = 0.333$ will occur when $m = 1/2N_e$.
With  this  rate  of  migration  there  would  be  only  one  immigrant 
individual  in  every  second  generation,  irrespective  of  the  population 
size.  Thus  we  see  that  only  a  small  amount  of  interchange  between 
sub-populations  will  suffice  to  prevent  them  from  differentiating 
appreciably  in  gene  frequency. 
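
The contrast between mutation and migration can be made concrete with a small calculation. This is a minimal sketch, taking the steady-state relation as $F = 1/[1 + 4N_e(u + v + m)]$; the function names and the particular values tried are assumptions made for illustration.

```python
def steady_state_f(n_e: float, u: float = 0.0, v: float = 0.0, m: float = 0.0) -> float:
    """Steady-state inbreeding coefficient, F = 1 / (1 + 4*Ne*(u + v + m))."""
    return 1.0 / (1.0 + 4.0 * n_e * (u + v + m))


def critical_rate(n_e: float, f_critical: float = 1.0 / 3.0) -> float:
    """Value of (u + v + m) at which the steady-state F equals f_critical."""
    return (1.0 / f_critical - 1.0) / (4.0 * n_e)


if __name__ == "__main__":
    # Mutation alone, u + v = 1e-5: the critical state needs Ne = 50,000.
    print(steady_state_f(50_000, u=5e-6, v=5e-6))
    # Migration alone: the critical rate corresponds to half an immigrant
    # per generation, whatever the effective size.
    for n_e in (20, 100, 1_000):
        m = critical_rate(n_e)
        print(n_e, m, m * n_e)
```

The product of the critical migration rate and the effective number is the same for every size tried, which is why the number of immigrants needed does not depend on population size.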

The  situation  to  which  this  consideration  of  migration  refers  is 
known as the "island model." It pictures a discontinuous population
such  as  might  be  found  inhabiting  widely  separated  islands,  inter- 
change taking  place  by  occasional  migrants  from  one  sub-population 
to  another.  But  differentiation  of  sub-populations  by  random  drift 
can  take  place  also  in  a  continuous  population  if  the  motility  of  the 
organism  is  small  in  relation  to  the  population  density.  This  is  known 
as  "isolation  by  distance"  or  the  "neighbourhood  model"  (Wright, 
1940; 1943; 1946; 1951). Clearly, if there is little dispersal over the
territory  between  one  generation  and  the  next  the  choice  of  mates  is 
restricted  and  mating  cannot  be  at  random.  The  population  is  then 
subdivided into "neighbourhoods" (Wright, 1946) within which
individuals find mates. A neighbourhood is an area within which
mating  is  effectively  random.  The  size  of  a  neighbourhood  depends 
on  the  distance  covered  by  dispersal  between  one  generation  and  the 
next.  If  the  distances  between  localities  inhabited  by  offspring  and 
parents  at  corresponding  stages  of  the  life  cycle  are  distributed  with  a 
variance $\sigma_d^2$, then the area of a neighbourhood is the area enclosed by a
circle of radius $2\sigma_d$, which is $\pi(2\sigma_d)^2$. The effective population size
of  a  neighbourhood  is  the  number  of  breeding  individuals  in  the 
area  of  a  neighbourhood.  The  subdivision  of  a  population  into 
neighbourhoods  leads  to  random  drift,  but  the  amount  of  local 
differentiation  depends  on  the  size  of  the  whole  population  as  well 
as  on  the  effective  number  in  the  neighbourhood.  If  the  whole 
population  is  not  very  much  larger  than  the  neighbourhood  then  the 
whole  population  will  drift,  and  there  will  be  little  local  differentiation 
within  it.  The  conclusion  to  which  the  neighbourhood  model  leads 
is  that  a  great  amount  of  local  differentiation  will  take  place  if  the 
effective  number  in  a  neighbourhood  is  of  the  order  of  20,  and  a 
moderate  amount  if  it  is  of  the  order  of  200;  but  with  larger  neigh- 
bourhoods it  will  be  negligible.  There  will  be  much  more  local 
differentiation  in  a  population  inhabiting  a  linear  territory,  such  as  a 
river  or  shore  line,  because  a  neighbourhood  is  then  open  to  immi- 
gration only  from  two  directions  instead  of  from  all  round.  The  extent 
of  a  neighbourhood  in  a  population  distributed  in  one  dimension  is 
the square root of the area of a neighbourhood in a population
distributed in two dimensions. The effective population size is
therefore the number of breeding individuals in a distance $2\sigma_d\sqrt{\pi}$ of
territory.

Example  4.3.  As  an  illustration  of  the  computation  of  the  effective 
population  size  of  a  neighbourhood  we  may  take  some  observations  from 
the detailed studies by Lamotte (1951) of the snail Cepaea nemoralis in
France.  Marked  individuals  were  released  in  spring  and  the  distance 
travelled  from  the  point  of  release  by  those  recaptured  in  the  autumn  was 
noted. Since the snails are inactive in winter this represented the
displacement occurring in one year. The mean displacement was 8.1 metres,
and its standard deviation 9.4 m. The standard deviation of the
displacement between birth and mating, which usually takes place in the second
year of life, was estimated as $\sigma_d = 15$ m. The area occupied by a
neighbourhood is therefore $\pi(2\sigma_d)^2 = 12.5\sigma_d^2$ = 2,813 sq. m. The density of
individuals in two large colonies was found to be 2 per sq. m., and in another
3  per  sq.  m.  The  effective  population  size  of  the  neighbourhoods  in  these 
colonies  was  therefore  about  5,600  and  8,400.    These  figures  are  a  good 
deal larger than the size of neighbourhoods from which we would expect
differentiation within the colonies. Five colonies inhabiting linear
territories had densities ranging from 4.5 to 20 individuals per metre. The
effective  population  size  of  the  neighbourhoods  in  these  colonies  ranged 
from  236  to  1,050.  These  are  approaching  the  size  from  which  differentia- 
tion within  a  colony  would  be  expected. 
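
The computation in this example can be sketched as follows, taking the area of a neighbourhood as $\pi(2\sigma_d)^2$ and its linear extent as $2\sigma_d\sqrt{\pi}$. The function names are illustrative. The sketch uses the full value of $\pi$, whereas the figures quoted in the example appear to rest on rounded constants, so the results agree with them only approximately.

```python
import math


def neighbourhood_area(sigma_d: float) -> float:
    """Area of a neighbourhood in two dimensions: pi * (2 * sigma_d)**2."""
    return math.pi * (2.0 * sigma_d) ** 2


def neighbourhood_length(sigma_d: float) -> float:
    """Extent of a neighbourhood in one dimension: 2 * sigma_d * sqrt(pi)."""
    return 2.0 * sigma_d * math.sqrt(math.pi)


def neighbourhood_size(extent: float, density: float) -> float:
    """Breeding individuals in a neighbourhood: extent multiplied by density."""
    return extent * density


if __name__ == "__main__":
    sigma_d = 15.0  # metres, as estimated in the example
    area = neighbourhood_area(sigma_d)
    length = neighbourhood_length(sigma_d)
    for density in (2.0, 3.0):    # individuals per sq. m.
        print(density, neighbourhood_size(area, density))
    for density in (4.5, 20.0):   # individuals per metre
        print(density, neighbourhood_size(length, density))
```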


Selection.  Selection  operating  on  a  locus  in  a  large  population 
brings  the  gene  frequency  to  an  equilibrium;  when  selection  against 
a  recessive  or  semidominant  gene  is  balanced  by  mutation  the 
equilibrium  is  at  a  low  gene  frequency,  and  when  selection  favours 
the heterozygote the equilibrium is more likely to be at an intermediate
frequency. The question we have now to consider is: How much can the
dispersive process disturb these equilibria and cause small populations
to deviate from the point of equilibrium? The importance of this
question lies in the fact that an increase of the frequency of a
deleterious gene will reduce the fitness, that is, will increase the
frequency of "genetic deaths", and the dispersive process may therefore
lead to non-adaptive changes in small populations. We shall not attempt
to cover the joint effects of selection and dispersion in detail, but
shall merely illustrate their general nature by reference to a
particular case of selection against a recessive gene balanced by
mutation. The effects of selection in favour of heterozygotes will be
discussed in the next chapter,
because  they  have  more  importance  in  connexion  with  close  in- 
breeding. 

Fig. 4.2. Theoretical distributions of gene frequency among
sub-populations when the dispersion is balanced by mutation and selection.
The graphs refer to a recessive gene with $u = v = s/20$, in populations of
size: (a) $N_e = 50/s$, (b) $N_e = 5/s$, and (c) $N_e = 0.5/s$. (Redrawn from
Wright, 1942.)

Fig. 4.2 shows the state of dispersion of a gene among
sub-populations of three sizes under the following conditions. Mutation
is supposed to be the same in both directions, and the coefficient of
selection against the homozygote is supposed to be twenty times the
mutation rate. In a large population the balance between the
mutation and the selection would bring the gene frequency to
equilibrium at about 0.2. The population sizes to which the graphs refer
are (a) $N_e = 50/s$, (b) $N_e = 5/s$, and (c) $N_e = 0.5/s$. If we assumed a
mutation rate of $10^{-5}$ in both directions then the intensity of selection would
be $s = 20 \times 10^{-5}$, and the effective population sizes to which the graphs
refer would be (a) 250,000 (b) 25,000 and (c) 2,500. These graphs
show  that  with  the  largest  value  of  Ne  there  is  little  differentiation 
between  sub-populations;  with  the  intermediate  value  of  Ne  random 
drift  is  strong  enough  to  cause  a  good  deal  of  differentiation;  with  the 
smallest  value  of  Ne  the  effects  of  random  drift  predominate  over  those 
of  mutation  and  selection,  intermediate  gene  frequencies  are  almost 
absent,  and  in  the  majority  of  sub-populations  the  allele  is  either  fixed 
or  lost.  In  this  case,  moreover,  a  fair  proportion  of  the  sub-populations 
have  the  deleterious  allele  fixed  in  them.  This  illustrates  how  random 
drift  can  overcome  relatively  weak  selection  and  lead  to  fixation  of  a 
deleterious  gene. 

This  particular  case  illustrates  in  principle  what  will  happen  when 
the  processes  of  random  drift,  selection,  and  mutation  are  all  operating. 
But  we  need  to  have  some  idea  of  how  intense  the  selection  must  be 
before  it  overcomes  the  effects  of  random  drift.  If  we  are  content  not 
to  be  very  precise  we  can  say  that  selection  begins  to  be  more  im- 
portant than  random  drift  when  the  coefficient  of  selection,  s,  is  of  the 
order of magnitude of $1/4N_e$. For example, in a population of effective
size 100, the critical value of $s$ would be about 0.0025. This is a very
low  intensity  of  selection,  quite  beyond  the  reach  of  experimental 
detection.  The  conclusion  to  be  drawn,  therefore,  is  that  in  all  but 
very  small  populations,  even  a  very  slight  selective  advantage  of  one 
allele  over  another  will  suffice  to  check  the  dispersive  process  before 
it  causes  an  appreciable  amount  of  fixation  or  of  differentiation  be- 
tween sub-populations. 
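
The rough criterion just given lends itself to a quick check. This is a sketch under the stated order-of-magnitude rule only, taking the critical coefficient of selection as $1/(4N_e)$; the function names and the values tried are hypothetical, and the comparison should be read as a guide to magnitude rather than as a sharp boundary.

```python
def critical_selection_coefficient(n_e: float) -> float:
    """Order-of-magnitude value of s at which selection begins to outweigh drift."""
    return 1.0 / (4.0 * n_e)


def selection_outweighs_drift(n_e: float, s: float) -> bool:
    """True when s exceeds the rough critical value 1/(4*Ne)."""
    return s > critical_selection_coefficient(n_e)


if __name__ == "__main__":
    for n_e in (10, 100, 1_000, 10_000):
        print(n_e, critical_selection_coefficient(n_e))
    print(selection_outweighs_drift(100, 0.01))
    print(selection_outweighs_drift(100, 0.001))
```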

Example  4.4.  The  opposing  forces  of  dispersion  and  selection  are 
illustrated  in  Fig.  4.3,  from  an  experiment  with  Drosophila  melanogaster 
(Wright  and  Kerr,  1954).  The  frequency  of  the  sex-linked  gene  "Bar" 
was  followed  for  10  generations  in  108  lines  each  maintained  by  4  pairs  of 
parents.  (On  account  of  the  complication  of  sex-linkage,  which  increases 
the rate of dispersion, the theoretical effective number was 6.765: the
effective number as judged from the actual rate of dispersion was $N_e = 4.87$.)
The initial gene frequency was 0.5. The circles in the figure show the
distribution of the gene frequency among the lines in the fourth to tenth
generations, when the distribution had reached its steady form. The
smooth curve shows the theoretical distribution based on $N_e = 5$ and a
coefficient of selection against Bar of $s = 0.17$. Previously fixed lines are
not included in the distributions. Altogether, at the tenth generation, 95
of the 108 lines had become fixed for the wild-type allele and 3 for Bar
while 10 remained unfixed. Thus, despite a 17 per cent selective
disadvantage, the deleterious allele was fixed in about 3 per cent of the lines.

Fig. 4.3. Distribution of gene frequencies under inbreeding and
selection, as explained in Example 4.4. (Data from Wright and
Kerr, 1954.) Horizontal axis: number of Bar genes.


Random  Drift  in  Natural  Populations 


Having  described  the  dispersive  process  and  its  theoretical  conse- 
quences, we  may  now  turn  to  the  more  practical  question  of  how  far 
these  consequences  are  actually  seen  in  natural  populations.  The 
answering  of  this  question  is  beset  with  difficulties,  and  the  following 
comments are intended more to indicate the nature of these difficulties
than to answer the question.

The theory of small populations, outlined in this and the preceding
chapter, is essentially mathematical in nature and is unquestionably
valid: given only the Mendelian mechanism of inheritance,
the conclusions arrived at are a necessary consequence under
the  conditions  specified.  The  question  at  issue,  then,  is  whether  the 
conditions  in  natural  populations  are  often  such  as  would  allow  the 
dispersion  of  gene  frequencies  to  become  detectable.  The  pheno- 
mena which  would  be  expected  to  result  from  the  dispersive  process, 
if  the  conditions  were  appropriate,  are  differentiation  between  the 
inhabitants  of  different  localities,  and  differences  between  successive 
generations.  Both  these  phenomena  are  well  known  in  subdivided  or 
small  isolated  populations,  and  it  is  tempting  to  conclude  that  because 
they  are  the  expected  consequences  of  random  drift,  random  drift 
must  be  their  cause.  But  there  are  other  possible  causes:  the  en- 
vironmental conditions  probably  differ  from  one  locality  to  another 
and  from  one  season  to  another;  so  the  intensity,  or  even  the  direction 
of  selection  may  well  vary  from  place  to  place  and  from  year  to  year, 
and  the  differences  observed  could  equally  well  be  attributed  to 
variation  of  the  selection  pressure.  Before  we  can  justifiably  attribute 
these  phenomena  to  random  drift,  therefore,  we  have  to  know  (a) 
that the effective population size is small enough, (b) that the
sub-populations are well enough isolated (or the size of the
"neighbourhoods" sufficiently small), and (c) that the genes concerned are subject
to  very  little  selection. 

The  estimation  of  the  present  size  of  a  population,  though  not  tech- 
nically easy,  presents  no  difficulties  of  principle.  But  the  present 
state  of  differentiation  depends  on  the  population  size  in  the  past, 
and  this  can  generally  only  be  guessed  at.  It  is  difficult  to  know  how 
often  the  population  may  have  been  drastically  reduced  in  size  in 
unfavourable  seasons,  and  the  dispersion  taking  place  in  these 
generations  of  lowest  numbers  is  permanent  and  cumulative.  There 
is  less  difficulty  in  deciding  whether  the  sub-populations  are  suffi- 
ciently well  isolated.  With  a  discontinuous  population  inhabiting 
widely  separated  islands,  it  is  often  possible  to  be  reasonably  sure 
that  there  is  not  too  much  immigration;  and  with  a  continuous 
population  the  size  of  the  "neighbourhoods"  is,  at  least  in  principle, 
measurable.  The  greatest  difficulty  lies  in  estimating  the  intensity  of 
natural  selection  acting  on  the  genes  concerned.  Selection  of  an 
intensity  far  lower  than  could  be  detected  experimentally  is  sufficient 
to check dispersion in all but the smallest populations. It seems
rather unlikely (though this is no more than an opinion) that any
gene  that  modifies  the  phenotype  enough  to  be  recognised  would 
have  so  little  effect  on  fitness.  The  genes  concerned  with  quantitative 
differences,  which  are  not  individually  recognisable,  may  however  be 
nearly  enough  neutral  for  random  drift  to  take  place.  There  is  no 
doubt  at  all  that  genes  of  this  sort  do  show  random  drift,  at  least  in 
laboratory  populations,  as  will  be  shown  in  Chapter  15.  Of  the 
individually  recognisable  genes,  those  concerned  with  polymorphism 
seem  the  most  likely  to  show  the  effects  of  random  drift.  At  inter- 
mediate frequencies  a  small  displacement  from  the  equilibrium  would 
be  detectable,  and  therefore  a  relatively  small  amount  of  dispersion 
of  the  gene  frequency  might  well  lead  to  recognisable  differentiation. 
The  following  example  will  serve  to  illustrate  the  observed  differen- 
tiation of  a  natural  population,  as  well  as  the  difficulties  of  its  inter- 
pretation. 

Example  4.5.  The  polymorphism  in  respect  of  the  banding  of  the 
shell  in  the  snail  Cepaea  nemoralis  has  been  extensively  studied  by  Lamotte 
(1951)  in  France.  The  population  is  subdivided  into  colonies  with  a  high 
degree  of  isolation  between  them.  The  absence  of  dark-coloured  bands 
on  the  shell  is  caused  by  a  single  recessive  gene.  The  mean  frequency  of 
bandless  snails  is  29  per  cent,  but  individual  colonies  range  between  the 
two  extremes,  some  being  entirely  bandless  and  a  few  entirely  banded. 
The  colonies  vary  in  the  number  of  individuals  that  they  contain,  and  291 
colonies  were  divided  into  three  groups  according  to  their  population 
size.  The  variation  in  the  frequency  of  bandless  snails  was  then  compared 
in  the  three  groups,  as  shown  in  Fig.  4.4.  The  variation  between  the 
colonies,  which  measures  the  degree  of  differentiation,  was  found  to  be 
greater  among  the  small  colonies  than  among  the  large.  The  variance  of 
the  frequency  of  bandless  between  colonies  was  0-067  among  colonies  of 
500-1,000  individuals,  0-048  among  colonies  of  1,000-3,000,  and  0-037 
among  colonies  of  3,000-10,000  individuals.  This  dependence  of  the 
degree  of  differentiation  on  the  population  size  is  interpreted  by  Lamotte 
as  evidence  that  the  differentiation  is  caused  by  random  drift. 

Cain  and  Sheppard  (1954a),  on  the  other  hand,  offer  a  different 
interpretation,  sustained  by  an  equally  thorough  study  of  colonies  in 
England.  They  show  that  predation  by  birds — chiefly  thrushes — exerts  a 
strong  selection  in  favour  of  shell  colours  matching  the  background  of  the 
habitat.  Though  the  polymorphism  is  maintained  by  selection,  of  an 
unknown  nature,  in  favour  of  heterozygotes,  the  frequency  of  the  different 
types  in  any  colony  is  determined  by  selection  in  relation  to  the  nature  of 
the  habitat.  In  the  areas  occupied  by  small  colonies,  they  argue,  there  is 
less variation of habitat than in the areas occupied by large colonies.
Therefore the variation of habitat between small colonies is greater than between
large.  This  they  regard  as  the  cause  of  the  greater  differentiation  among 
small  colonies  than  among  large,  selection  bringing  the  frequency  of  band- 
less  forms  to  a  value  appropriate  to  the  mean  habitat  of  the  colony.  It  is 
not  for  us  here  to  attempt  an  assessment  of  these  two  conflicting  interpre- 
tations. 


Fig. 4.4. Distribution of the frequency of bandless snails among
colonies of three sizes. Horizontal axis: frequency of bandless.
(Data from Lamotte, 1951.)

                                  (a)            (b)             (c)
Population size                500-1,000     1,000-3,000     3,000-10,000
Mean frequency of bandless       0.292          0.256           0.211
Variance between colonies        0.067          0.048           0.037


CHAPTER   5 


SMALL  POPULATIONS: 
III.  Pedigreed  Populations  and  Close  Inbreeding 

In  the  two  preceding  chapters  the  genetic  properties  of  small  popu- 
lations were  described  by  reference  to  the  effective  number  of  breeding 
individuals;  and  expressions  were  derived,  in  terms  of  the  effective 
number,  by  means  of  which  the  state  of  dispersion  of  the  gene 
frequencies  could  be  expressed  as  the  coefficient  of  inbreeding.  The 
coefficient  of  inbreeding,  which  is  the  probability  of  any  individual 
being  an  identical  homozygote,  was  deduced  from  the  population  size 
and  the  specified  breeding  structure.  It  expressed,  therefore,  the 
average  inbreeding  coefficient  of  all  individuals  of  a  generation. 
When  pedigrees  of  the  individuals  are  known,  however,  the  coeffi- 
cient of  inbreeding  can  be  more  conveniently  deduced  directly  from 
the  pedigrees,  instead  of  indirectly  from  the  population  size.  This 
method  has  several  advantages  in  practice.  Knowledge  is  often  re- 
quired of  the  inbreeding  coefficient  of  individuals,  rather  than  of  the 
generation  as  a  whole,  and  this  is  what  the  calculation  from  pedigrees 
yields.  In  domestic  animals  some  individuals  often  appear  as  parents 
in  two  or  more  generations,  and  this  overlapping  of  generations  causes 
no  trouble  when  the  pedigrees  are  known.  (Non-overlapping  of 
generations  was  one  of  the  conditions  of  the  idealised  population 
which  we  have  not  yet  removed.)  The  first  topic  for  consideration  in 
this  chapter  is  therefore  the  computation  of  inbreeding  coefficients 
from  pedigrees.  The  second  topic  concerns  regular  systems  of  close 
inbreeding.  When  self-fertilisation  is  excluded  the  rate  of  inbreeding 
expressed  in  terms  of  the  population  size  is  only  an  approximation, 
and  the  approximation  is  not  close  enough  if  the  population  size  is 
very  small.  Under  systems  of  close  inbreeding,  therefore,  the  rate  of 
inbreeding  must  be  deduced  differently,  and  this  is  best  done  also 
by  consideration  of  the  pedigrees. 

When  the  coefficient  of  inbreeding  is  deduced  from  the  pedigrees 
of real populations it does not necessarily describe the state of
dispersion of the gene frequencies. It is essentially a statement about
the pedigree relationships, and its correspondence with the state of
dispersion  is  dependent  on  the  absence  of  the  processes  that  counteract 
dispersion,  in  particular  on  selection  being  negligible.  We  were  able 
to  use  the  coefficient  of  inbreeding  as  a  measure  of  dispersion  in  the 
preceding  chapters  because  the  necessary  conditions  for  its  relation- 
ship with  the  variance  of  gene  frequencies  were  specified. 


Pedigreed  Populations 

The inbreeding coefficient of an individual is the probability
that the pair of alleles carried by the gametes that produced it were
identical by descent. Computation of the inbreeding coefficient therefore
requires no more than the tracing of the pedigree back to common
ancestors of the parents and computing the probabilities at each
segregation. Consider the pedigree in Fig. 5.1. X is the individual
we are interested in, whose parents are P and Q. We want to know what
is the probability that X receives identical alleles transmitted
through P and Q from A. Consider first B and C. The probability that
they receive replicates of the same gene from A is $\tfrac{1}{2}$, and
the probability that they receive different genes is $\tfrac{1}{2}$. But
if they receive different genes from A, then the probability of these
being identical as a result of previous inbreeding is the inbreeding
coefficient of A. Therefore the total probability of B and C receiving
identical genes from A is $\tfrac{1}{2}(1 + F_A)$. Put in other words,
this is the probability that two gametes taken at random from A will
contain identical alleles.

Fig. 5.1. Pedigree of X: P descends from A through B and D; Q descends
from A through C.

Now consider the rest of the path through B. The probability that B
passes the gene it got from A on to D is $\tfrac{1}{2}$; from D to P is
$\tfrac{1}{2}$, and from P to X is $\tfrac{1}{2}$. Similarly for the other
side of the ancestry through C and Q. Putting all this together we find
the probability that X receives identical alleles descended from A is
$\tfrac{1}{2}(1 + F_A)(\tfrac{1}{2})^{3+2}$, or
$\tfrac{1}{2}(1 + F_A)(\tfrac{1}{2})^{n_1+n_2}$, where $n_1$ is the number
of generations from one parent back to the common ancestor and $n_2$
from the other parent. If the two parents have more than one ancestor
in common the separate probabilities for each of the common ancestors
have to be summed to give the inbreeding coefficient of the progeny of
these parents. Thus the general expression for the inbreeding
coefficient of an individual is

$F_X = \sum\left[(\tfrac{1}{2})^{n_1+n_2+1}(1 + F_A)\right]$    (5.1)

(Wright,  1922).  When  inbreeding  coefficients  are  computed  in  this 
way  it  is  necessary,  of  course,  to  define  the  base  population  to  which 
the  present  inbreeding  is  referred.  The  base  population  might  be  the 
individuals  from  which  an  experiment  was  started  or  a  herd  founded; 
or  it  might  be  those  born  before  a  certain  date.  The  designation  of  an 
individual  as  belonging  to  the  base  population  means  that  it  will  be 
assumed  to  have  zero  inbreeding  coefficient.  When  pedigrees  are 
long  and  complicated  there  may  be  very  many  common  ancestors,  but 
it  is  not  necessary  to  trace  back  all  lines  of  descent.  A  sufficiently 
accurate  estimate  can  be  got  by  sampling  a  limited  number  of  lines  of 
descent  (Wright  and  McPhee,  1925). 

Example 5.1. As an illustration of the use of formula 5.1 let us consider
the hypothetical pedigree in Fig. 5.2. The relevant individuals in
the pedigree are indicated by letters. Individual Z is the one
whose inbreeding coefficient is to be computed. Its parents are X
and Y, so we have to trace the paths of common ancestry connecting
X with Y. There are four common ancestors, A, B, C, and H, and five
paths connecting X with Y through them. We assume A, B, and C to have
zero inbreeding coefficients, since the pedigree tells us nothing about
their ancestry. Individual H, however, has parents that are half sibs,
and the inbreeding coefficient of H is therefore
$(\tfrac{1}{2})^{1+1+1} = \tfrac{1}{8}$. The computation of the separate
paths may now be made as shown in the table. By addition of the
contributions from the five paths we get the inbreeding coefficient of Z
as $F_Z = 0.05078$, or 5.1 per cent.

Fig. 5.2. Hypothetical pedigree of individual Z.

Common     Path from     Generations to       Inbreeding      Contribution to
ancestor   X to Y        common ancestor:     coeff. of       inbreeding of Z
                         from X    from Y     common
                                              ancestor

A          KGCADHL         4         4        0               $(\tfrac{1}{2})^9 = 0.00195$
B          KHDBEJM         4         4        0               $(\tfrac{1}{2})^9 = 0.00195$
B          KHDBEL          4         3        0               $(\tfrac{1}{2})^8 = 0.00391$
C          KGCHL           3         3        0               $(\tfrac{1}{2})^7 = 0.00781$
H          KHL             2         2        $\tfrac{1}{8}$  $(\tfrac{1}{2})^5(1 + \tfrac{1}{8}) = 0.03516$

Total by summation                                            0.05078
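
The arithmetic of Example 5.1 can be checked with a short script. This is only a sketch: the function name `path_contribution` and the tuple layout are illustrative choices, and the generation counts are those listed in the table of the example.

```python
from fractions import Fraction

def path_contribution(n1, n2, f_ancestor=Fraction(0)):
    """One path's contribution to F (formula 5.1): (1/2)^(n1+n2+1) * (1 + F_A)."""
    return Fraction(1, 2) ** (n1 + n2 + 1) * (1 + f_ancestor)

# H's parents are half sibs: one common ancestor, one generation back on each side.
f_h = path_contribution(1, 1)

# (common ancestor, generations from X, generations from Y, F of common ancestor)
paths = [
    ("A", 4, 4, Fraction(0)),
    ("B", 4, 4, Fraction(0)),
    ("B", 4, 3, Fraction(0)),
    ("C", 3, 3, Fraction(0)),
    ("H", 2, 2, f_h),
]

# Sum the separate contributions to get the inbreeding coefficient of Z.
f_z = sum(path_contribution(n1, n2, fa) for _, n1, n2, fa in paths)
print(f_h, f_z, round(float(f_z), 5))
```

Exact fractions are used so that the total can be compared directly with the rounded decimal contributions in the table.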

"Coancestry."  There  is  another  method  of  computing  inbreed- 
ing coefficients  (Cruden,  1949;  Emik  and  Terrill,  1949)  which  is  more 
convenient  for  many  purposes,  and  is  also  more  readily  adapted  to 
a  variety  of  problems.  We  shall  use  it  later  to  work  out  the  inbreeding 
coefficients  under  regular  systems  of  close  inbreeding.  The  method 
does not differ in principle from the formula 5.1 given above, but
instead  of  working  from  the  present  back  to  the  common  ancestors 
we  work  forward,  keeping  a  running  tally  generation  by  generation, 
and  compute  the  inbreeding  that  will  result  from  the  matings  now 
being  made.  The  inbreeding  coefficient  of  an  individual  depends  on 
the  amount  of  common  ancestry  in  its  two  parents.  Therefore, 
instead  of  thinking  about  the  inbreeding  of  the  progeny,  we  can  think 
of the degree of relationship by descent between the two parents.
This we shall call the coancestry of the two parents, and symbolise it
by $f$. It is identical with the inbreeding coefficient of the progeny,
and  is  the  probability  that  two  gametes  taken  one  from  one  parent 
and  one  from  the  other  will  contain  alleles  that  are  identical  by 
descent.  (Malecot,  1948,  calls  this  the  "coefficient  de  parente,"  but 
the  translation  "coefficient  of  relationship"  cannot  be  used  because 
Wright  (1922)  has  used  this  term  with  a  different  meaning.) 

Consider the generalised pedigree in Fig. 5.3. X is an individual with
parents P and Q and grandparents A, B, C, and D. Now, the coancestry of P
with Q is fully determined by the coancestries relating A and B with C
and D, and if these are known we need go no further back in the
pedigree. It can be shown that the coancestry of P with Q is simply the
mean of the four coancestries AC, AD, BC, and BD. This will be clearer
if stated in the form of probabilities, though the explanation is
cumbersome when put into words. Take one gamete at random from P and one
from Q, and repeat this many times. In half the cases P's gamete will
carry a gene from A and in half from B: similarly for Q's gamete. So the
two gametes, one from P and one from Q, will carry genes from A and C in
a quarter of the cases, from A and D in a quarter, from B and C in a
quarter, and from B and D in a quarter of the cases. Now the probability
that two gametes taken at random, one from A and the other from C, are
identical by descent is the coancestry of A with C, i.e. $f_{AC}$, etc.
So, reverting now to symbols,

$f_{PQ} = \tfrac{1}{4}f_{AC} + \tfrac{1}{4}f_{AD} + \tfrac{1}{4}f_{BC} + \tfrac{1}{4}f_{BD}$

This gives the basic rule relating coancestries in one generation with
those in the next:

$F_X = f_{PQ} = \tfrac{1}{4}(f_{AC} + f_{AD} + f_{BC} + f_{BD})$    (5.2)

    A x B     C x D
      |         |
      P    x    Q
           |
           X

    Fig. 5.3


With  this  rule  the  experimenter  can  tabulate  the  coancestries  genera- 
tion by  generation,  and  this  gives  a  basis  for  planning  matings  and 
computing  inbreeding  coefficients.  More  detailed  accounts  of  the 
operation  are  given  by  Cruden  (1949),  Emik  and  Terrill  (1949),  and 
Plum  (1954). 
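
The tabulation of coancestries can be sketched in code. The following is a minimal illustration, not the procedure of the authors cited: the pedigree dictionary is hypothetical (it is the full-sib pedigree, with base individuals assumed unrelated and non-inbred), and the function names are invented for the example. It applies the rule that the coancestry of an individual with another is the mean of the coancestries of its two parents with that other, and that the coancestry of an individual with itself is $\tfrac{1}{2}(1 + F)$.

```python
from fractions import Fraction
from functools import lru_cache

# Hypothetical pedigree: individual -> (sire, dam); None marks an unknown (base) parent.
# Individuals are listed so that parents always appear before their offspring.
PEDIGREE = {
    "A": (None, None),
    "B": (None, None),
    "P1": ("A", "B"),
    "P2": ("A", "B"),
    "X": ("P1", "P2"),
}
ORDER = {name: i for i, name in enumerate(PEDIGREE)}

@lru_cache(maxsize=None)
def coancestry(x, y):
    """Probability that a random gamete from x and one from y carry identical alleles."""
    if x is None or y is None:
        return Fraction(0)          # unknown parents are taken as unrelated
    if x == y:
        return Fraction(1, 2) * (1 + inbreeding(x))
    if ORDER[x] < ORDER[y]:
        x, y = y, x                 # always expand the younger individual
    sire, dam = PEDIGREE[x]
    return Fraction(1, 2) * (coancestry(sire, y) + coancestry(dam, y))

def inbreeding(x):
    """Inbreeding coefficient of x = coancestry of its two parents."""
    sire, dam = PEDIGREE[x]
    return coancestry(sire, dam)

print(inbreeding("X"))              # offspring of full sibs from an unrelated pair
```

Expanding the younger of the two individuals guarantees that the other cannot be one of its descendants, so the recursion always works back towards the base population.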

If  there  is  overlapping  of  generations  it  may  happen  that  we  must 
find  the  coancestry  between  individuals  belonging  to  different 
generations.  This  situation  is  covered  by  the  following  supplementary 
rules,  which  can  readily  be  deduced  by  a  consideration  of  probabilities 
in the manner explained above. Referring to the same pedigree
(Fig. 5.3),

$f_{PC} = \tfrac{1}{2}(f_{AC} + f_{BC})$
$f_{PD} = \tfrac{1}{2}(f_{AD} + f_{BD})$
and
$f_{PQ} = \tfrac{1}{2}(f_{PC} + f_{PD})$    (5.3)


which  by  substitution  reduces  to  the  basic  rule. 
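
The substitution may be written out in full. The lines below only restate the supplementary rules for the pedigree of Fig. 5.3, with the coancestries of P with C and with D replaced by their expansions; no new assumption is involved.

```latex
\begin{aligned}
f_{PQ} &= \tfrac{1}{2}\,(f_{PC} + f_{PD}) \\
       &= \tfrac{1}{2}\left[\tfrac{1}{2}(f_{AC} + f_{BC}) + \tfrac{1}{2}(f_{AD} + f_{BD})\right] \\
       &= \tfrac{1}{4}\,(f_{AC} + f_{AD} + f_{BC} + f_{BD})
\end{aligned}
```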

Before  we  can  apply  this  method  to  systems  of  close  inbreeding 
we  have  to  see  how  the  basic  rule  is  to  be  applied  when  there  are 
fewer than four grandparents. As an example we shall consider the
coancestry between a pair of full sibs. The pedigree can be written
as in Fig. 5.4: A and B are parents of both $P_1$ and $P_2$, which are
full sibs and have an offspring X. Applying the basic rule (equation
5.2), and noting that $f_{BA} = f_{AB}$, we have

$F_X = f_{P_1P_2} = \tfrac{1}{4}(f_{AA} + f_{BB} + 2f_{AB})$    (5.4)

Fig. 5.4. Pedigree of full sibs: A and B are the parents of both $P_1$
and $P_2$, whose offspring is X.


The meaning of $f_{AA}$, the coancestry of an individual with itself, is the
probability that two gametes taken at random from A will contain
identical alleles, and we have already seen that this probability is
equal to $\tfrac{1}{2}(1 + F_A)$. The value of $F_A$ will be known from the
coancestry of A's parents. The coancestry between offspring and parent
can be found in a similar way, by application of the supplementary rules
in 5.3. Substituting the individuals in Fig. 5.4 for those in Fig. 5.3 and
applying the first two equations of 5.3 gives

$f_{PA} = \tfrac{1}{2}(f_{AA} + f_{AB})$
$f_{PB} = \tfrac{1}{2}(f_{BB} + f_{AB})$    (5.5)

where P is equivalent to either $P_1$ or $P_2$; and applying the third
equation of 5.3 gives the coancestry between full sibs

$f_{P_1P_2} = \tfrac{1}{2}(f_{PA} + f_{PB})$
$= \tfrac{1}{4}(f_{AA} + f_{BB} + 2f_{AB})$

as  above.  We  now  have  all  the  rules  needed  for  computing  the  in- 
breeding coefficients  in  successive  generations  under  regular  systems 
of  inbreeding. 


Regular  Systems  of  Inbreeding 

The  consequences  of  regular  systems  of  inbreeding  have  been 
the  subject  of  much  study.  They  were  first  described  in  detail  by 
Wright (1921) in a series of papers which form the foundation of the
whole  theory  of  small  populations.  Wright's  studies  were  based  on 
the  method  of  path  coefficients  (Wright,  1934,  1954).  Haldane 
(1937,  1955)  and  Fisher  (1949)  derived  the  consequences  by  the 
method  of  matrix  algebra.  The  inbreeding  coefficients  in  successive 
generations  can,  however,  be  more  simply  derived  by  application  of 
the  rules  of  coancestry  explained  in  the  previous  section,  and  this  is 
the  method  we  shall  follow  here.  We  shall  illustrate  the  application 
of  the  method  for  consecutive  full-sib  mating,  which  is  one  of  the  most 
commonly  used  systems,  and  give  the  results  for  some  other  systems. 
The  inbreeding  coefficients  refer  to  autosomal  genes;  the  results  for 
sex-linked  genes  are  described  by  Wright  (1933)  in  a  paper  which 
also  contains  a  useful  summary  of  the  results  for  autosomal  genes  in 
a  great  variety  of  mating  systems. 

Full-sib mating. The equation 5.4 given above for the coancestry
between full sibs can be applied to successive generations to
give the inbreeding coefficients under continued full-sib mating.
But it is more convenient to rearrange the equation so that the
inbreeding coefficient is given in terms of the inbreeding coefficients
of the previous generations. Note first that, because the mating
system is regular, contemporaneous individuals have the same inbreeding
coefficients and coancestries: so, referring again to the pedigree in
Fig. 5.4, $f_{AA} = f_{BB}$, and $F_A = F_B$. Now, if we let t be the
generation to which individual X belongs, then $f_{AB} = F_{t-1}$, and
$f_{AA} = f_{BB} = \tfrac{1}{2}(1 + F_{t-2})$.
The coancestry equation can therefore be rewritten to give the
inbreeding coefficient in any generation, t, in terms of the inbreeding
coefficients of the previous two generations, thus:

$F_t = \tfrac{1}{4}(1 + 2F_{t-1} + F_{t-2})$    (5.6)

This recurrence equation enables us to write down the inbreeding
coefficients in successive generations. In the first generation $F_{t-1}$
and $F_{t-2}$ are both zero and so $F_{(t=1)} = 0.25$. The inbreeding
coefficients in the first four generations are 0.25, 0.375, 0.50, and 0.59.
The rate of inbreeding is not constant in the first few generations, as
may be seen by computing $\Delta F$ from equation 3.9. For the first four
generations $\Delta F$ is 0.25, 0.17, 0.20, and 0.19. It later settles down
to a constant value of 0.191 (Wright, 1931). The inbreeding
coefficients over the first 20 generations of full-sib mating are given in
Table 5.1.

System of mating                                      Recurrence equation

Self-fertilisation, or repeated backcrosses           $\tfrac{1}{2}(1 + F_{t-1})$
  to highly inbred line
Full brother x sister, or offspring x younger
  parent:
    Inbreeding coefficient                            $\tfrac{1}{4}(1 + 2F_{t-1} + F_{t-2})$
    Probability of fixation (from Schafer, 1937)
Half sib (females half sisters)                       $\tfrac{1}{8}(1 + 6F_{t-1} + F_{t-2})$
Repeated backcrosses to random-bred individual        $\tfrac{1}{4}(1 + 2F_{t-1})$
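
The recurrence for full-sib mating is easily iterated numerically. The sketch below makes one assumption not stated in this section: that equation 3.9 defines the rate of inbreeding as $\Delta F = (F_t - F_{t-1})/(1 - F_{t-1})$, which reproduces the values quoted above. The function names are illustrative only.

```python
def full_sib_inbreeding(generations):
    """Return [F_1, ..., F_n] from F_t = (1 + 2*F_{t-1} + F_{t-2}) / 4."""
    f_prev2, f_prev1 = 0.0, 0.0      # F_{t-2} and F_{t-1} are zero in the base population
    values = []
    for _ in range(generations):
        f_t = 0.25 * (1 + 2 * f_prev1 + f_prev2)
        values.append(f_t)
        f_prev2, f_prev1 = f_prev1, f_t
    return values

def rate_of_inbreeding(f_values):
    """Delta F per generation, assumed to be (F_t - F_{t-1}) / (1 - F_{t-1}), with F_0 = 0."""
    previous = 0.0
    rates = []
    for f_t in f_values:
        rates.append((f_t - previous) / (1 - previous))
        previous = f_t
    return rates

f_values = full_sib_inbreeding(20)
rates = rate_of_inbreeding(f_values)
print([round(f, 3) for f in f_values[:4]])
print([round(r, 2) for r in rates[:4]], round(rates[-1], 3))
```

The first four values of each list can be compared with those given in the text, and the last rate with the constant value to which $\Delta F$ settles.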

Some other systems of mating may now be mentioned briefly.
Self-fertilisation gives the most rapid inbreeding. If X is the
offspring of P, we have from the coancestry identities

$F_X = f_{PP} = \tfrac{1}{2}(1 + F_P)$

and the recurrence equation is therefore

$F_t = \tfrac{1}{2}(1 + F_{t-1})$    (5.7)

The inbreeding coefficients over the first ten generations of
self-fertilisation are given in Table 5.1. The rate of inbreeding is
constant from the beginning; $\Delta F = 0.5$ exactly.

Parent-offspring mating, in which offspring are mated to the
younger parent, gives the same series of inbreeding coefficients as
full-sib mating for autosomal genes, but for sex-linked genes it gives
a slightly higher rate of inbreeding. For sex-linked genes $\Delta F$ is 0.293
after the first few generations (Wright, 1933).

Half-sib mating is usually between paternal half sibs, one male being
mated to two or more of his half sisters. If these females are half
sisters of each other the recurrence equation is

$F_t = \tfrac{1}{8}(1 + 6F_{t-1} + F_{t-2})$    (5.8)

The first 20 generations are given in Table 5.1. There are, however,
practical difficulties in the way of maintaining this system regularly,
and sometimes females that are full sisters of each other have to be
used. The inbreeding will then go a little faster. If full-sister females
are always used the recurrence equation is

$F_t = \tfrac{1}{16}(3 + 8F_{t-1} + 4F_{t-2} + F_{t-3})$    (5.9)

Repeated backcrosses to an individual or to a highly
inbred line are often made, for a variety of purposes.
The resulting inbreeding is as follows. The pedigree
(Fig. 5.5) shows an individual, A, which will probably
be a male, mated to his daughter, C, his granddaughter,
D, etc. From the supplementary rule (5.5)

$F_X = f_{AD} = \tfrac{1}{2}(f_{AA} + f_{AC})$

The recurrence equation is therefore

$F_t = \tfrac{1}{4}(1 + F_A + 2F_{t-1})$    (5.10)

where $F_A$ is the inbreeding coefficient of the individual to which the
repeated backcrosses are made. If A is an individual from the base
population and $F_A = 0$, the equation becomes

$F_t = \tfrac{1}{4}(1 + 2F_{t-1})$    (5.11)

The inbreeding coefficients over the first 9 generations are given in
Table 5.1. If A is an individual from a highly inbred line and $F_A = 1$,
the equation becomes

$F_t = \tfrac{1}{2}(1 + F_{t-1})$    (5.12)

which is identical with self-fertilisation. In this case A need not be
the same individual in successive generations: it can be any member
of the inbred line.

Fig. 5.5. Pedigree for repeated backcrosses: A is mated successively to
his daughter C and his granddaughter D, the latter mating producing X.

Example 5.2. As an example of the use of coancestry for computing
inbreeding  coefficients  let  us  consider  populations  derived  from  "2-way" 
and  from  "4-way"  crosses  between  highly  inbred  lines.  In  a  2-way  cross 
two  inbred  lines  are  crossed  and  the  population  is  maintained  by  random 
mating  among  the  cross-bred  individuals  and  subsequently  among  their 
progeny.  In  a  4-way  cross  four  inbred  lines  are  crossed  in  two  pairs,  and 
the  two  cross-bred  groups  are  again  crossed,  subsequent  generations  being 
maintained  by  random  mating.  If  the  base  population  is  taken  to  be  a  real, 
or  hypothetical,  random-bred  population  from  which  the  inbred  lines  were 
derived,  we  may  compute  the  inbreeding  coefficients  of  the  population 
derived  from  the  cross,  referring  it  to  this  base.  The  crosses  and  sub- 
sequent generations  are  shown  schematically  in  the  diagram  below. 

Generation     2-way cross        4-way cross

1              A x B              A x B          C x D
                 |                  |              |
2              X1 x X2            X1, X2         Y1, Y2
                 |
                 |                X1 x Y1        X2 x Y2
                 |                   |              |
3                O                   Z1      x      Z2
                                             |
4                                            O

The  inbred  lines  are  represented  by  A,  B,  C,  and  D.  If  they  are  fully 
inbred,  as  we  shall  take  them  to  be,  the  coefficient  of  inbreeding  of  the 
individuals  from  the  lines  is  1,  and  the  coancestry  of  an  individual  with 
another  of  the  same  line  is  also  1.  Therefore  only  one  individual  of  each 
line  need  be  represented  in  the  scheme,  even  though  any  number  may 
actually  be  used.  The  progeny  of  the  crosses  between  the  inbred  lines  are 
represented  by  X  and  Y,  the  suffices  1  and  2  indicating  different  individuals. 
In  the  2-way  cross  the  progeny  of  these  cross-bred  individuals  are  the 
foundation  generation  whose  inbreeding  coefficient  we  are  to  compute. 
They  are  represented  by  O.  In  the  4-way  cross  the  two  sorts  of  cross-bred 
individuals,  X  and  Y,  are  crossed,  one  sort  with  the  other.  Two  such 
matings  are  represented  in  the  scheme.  They  produce  the  "double-cross" 
individuals,  Z,  whose  progeny  constitute  the  foundation  generation  repre- 
sented by  O,  whose  inbreeding  coefficient  we  are  to  compute. 

In  the  computation  of  the  coancestries  we  shall  omit  the  symbol  /, 
writing for example AB for $f_{AB}$, the coancestry of individual A with
individual B. The coancestries of the parents in generation 1 are

AA = BB = CC = DD = 1

The coancestries in the second generation of the 2-way cross are

$X_1X_2 = \tfrac{1}{4}(AA + BB + AB + BA)$    (by equation 5.2)
$= \tfrac{1}{4}(1 + 1 + 0 + 0)$
$= \tfrac{1}{2}$

Therefore $F_O = 0.5$, which is the required inbreeding coefficient of the
foundation  generation  of  the  population  derived  from  the  2-way  cross. 
The  subsequent  matings  between  the  O  individuals  need  produce  no 
further  inbreeding  provided  enough  2nd  generation  matings  are  made. 
The coancestries in the second generation of the 4-way cross are

$$X_1X_2 = Y_1Y_2 = \tfrac{1}{2} \quad \text{(as shown for the 2-way cross)}$$

and

$$X_1Y_2 = X_2Y_1 = \tfrac{1}{4}(AC + AD + BC + BD) = 0$$


The coancestries in the third generation are

$$Z_1Z_2 = \tfrac{1}{4}(X_1X_2 + Y_1Y_2 + X_1Y_2 + X_2Y_1) = \tfrac{1}{4}(\tfrac{1}{2} + \tfrac{1}{2} + 0 + 0) = \tfrac{1}{4}$$

Therefore the inbreeding coefficient of the foundation generation is
$F_O = 0.25$. Again, the inbreeding need not increase further, provided
enough  third  generation  matings  are  made. 

The meaning of these coefficients of inbreeding, with the base population
as stated, may be clarified thus. If we made a large number of 2-way,
or of 4-way, crosses each with a different set of inbred lines, the populations
derived from the crosses would constitute a set of lines or sub-populations.
The inbreeding coefficients would then indicate the expected amount of
dispersion of gene frequencies among these lines. Populations derived
from 2-way crosses are equivalent to progenies of one generation of
self-fertilisation. The gene frequencies can therefore have only three
values, 0, $\tfrac{1}{2}$, and 1. Populations derived from 4-way crosses are
equivalent to progenies of one generation of full-sib mating, and the gene
frequencies can have only five values, 0, $\tfrac{1}{4}$, $\tfrac{1}{2}$,
$\tfrac{3}{4}$, and 1.
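
The coancestry computations of Example 5.2 can be checked mechanically.
The following is a minimal sketch in Python, not part of the original
text; the names PARENTS and coancestry are illustrative. It applies the
rule used in the example (the coancestry of two individuals is one quarter
of the sum of the four coancestries between their parents) and is meant
only for two distinct individuals of the same generation, with fully
inbred and unrelated lines assumed in generation 1.

```python
from fractions import Fraction

# Generation 1: fully inbred, unrelated lines (assumed, as in Example 5.2).
LINES = ("A", "B", "C", "D")

# Parents of each cross-bred individual in the scheme.
PARENTS = {
    "X1": ("A", "B"), "X2": ("A", "B"),
    "Y1": ("C", "D"), "Y2": ("C", "D"),
    "Z1": ("X1", "Y1"), "Z2": ("X2", "Y2"),
}


def coancestry(p, q):
    """Coancestry of two distinct individuals p and q of one generation."""
    if p in LINES and q in LINES:
        # Individuals of the same line: 1; of different lines: 0.
        return Fraction(1) if p == q else Fraction(0)
    (a, b), (c, d) = PARENTS[p], PARENTS[q]
    return Fraction(1, 4) * (
        coancestry(a, c) + coancestry(a, d)
        + coancestry(b, c) + coancestry(b, d)
    )


# The inbreeding coefficient of O is the coancestry of its parents.
f_two_way = coancestry("X1", "X2")    # parents of O in the 2-way cross
f_four_way = coancestry("Z1", "Z2")   # parents of O in the 4-way cross
print(f_two_way, f_four_way)          # prints: 1/2 1/4
```

Exact fractions are used so that the results can be compared directly
with the values 0.5 and 0.25 obtained in the example.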


Reference to a different base population. Having computed
a coefficient of inbreeding with reference to a certain group of
individuals as the base population, one may then want to change the base
and refer the inbreeding coefficient to another group of individuals.
One might, for example, compute the inbreeding coefficient of a herd
of cattle referred to the foundation animals of the herd as the base,
and then want to recompute the inbreeding coefficient so as to refer to
the breed as a whole with a base population in the more remote past.
Let X represent the group of individuals whose inbreeding coefficient is
required, and let A and B represent ancestral groups, A being more
remote than B, as shown in Fig. 5.6. Then it follows from equation 3.11
that

$$P_{X.A} = P_{X.B} P_{B.A} \qquad (5.13)$$

where $P_{X.A} = 1 - F_{X.A}$, $F_{X.A}$ being the inbreeding coefficient of X
referred to A as base, and similarly for the other subscripts.

[Fig. 5.6. Line of descent: A (the more remote ancestral group), then B,
then X.]

Example  5.3.  A  selection  experiment  with  mice  was  started  from  a 
foundation  population  made  by  a  4-way  cross  of  highly  inbred  lines 
(Falconer,  1953).  According  to  the  computation  given  above  in  Example 
5.2,  the  inbreeding  coefficient  of  this  foundation  population  was  reckoned 
to  be  25  per  cent.  On  this  basis  the  inbreeding  coefficients  of  subsequent 
generations  were  computed  from  the  pedigrees  by  the  coancestry  method.  The 
inbreeding coefficient at generation 24, computed thus, was 58.8 per cent.
What would the inbreeding coefficient be if referred to the foundation
population as base, instead of to the more remote hypothetical population
from which the inbred lines were derived? The figures to be substituted in
equation 5.13 are $P_{X.A} = 0.412$ and $P_{B.A} = 0.75$. Therefore
$P_{X.B} = 0.412 / 0.75 = 0.549$. The inbreeding coefficient at generation
24, referred to the foundation population as base, is therefore 45.1 per
cent.
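
The change of base in this example amounts to one division. The sketch
below, added for illustration, rearranges equation 5.13 to give the
inbreeding coefficient of X referred to B; the function name
rebase_inbreeding is hypothetical and the inputs are the two figures
quoted in the example.

```python
def rebase_inbreeding(f_x_a, f_b_a):
    """Inbreeding of X referred to B, given F of X and F of B referred to A."""
    p_x_a = 1.0 - f_x_a      # P = 1 - F for X relative to A
    p_b_a = 1.0 - f_b_a      # P = 1 - F for B relative to A
    p_x_b = p_x_a / p_b_a    # equation 5.13 solved for P of X relative to B
    return 1.0 - p_x_b


# Generation 24 of the mouse experiment: 58.8 per cent on the remote base,
# foundation population itself at 25 per cent on that base.
f_x_b = rebase_inbreeding(0.588, 0.25)
print(round(100 * f_x_b, 1))  # per cent, referred to the foundation population
```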

We may use this population of mice also to compare the rate of
inbreeding when computed by the two methods, from the pedigrees and from
the effective population size. Computed from the pedigrees, the average
rate of inbreeding over the 24 generations is found from equation 3.12
thus: $0.451 = 1 - (1 - \Delta F)^{24}$, whence $\Delta F = 2.47$ per cent.
The population was maintained by six pairs of parents in each generation.
Matings were made between individuals with the lowest coancestries and
this has the effect of equalising family size, as explained in the
previous chapter. Therefore, by equation 4.8, the effective number was
twice the actual, i.e. $N_e = 24$. The rate of inbreeding, by equation 4.1,
is therefore $\Delta F = 1/(2N_e) = 1/48 = 2.08$ per cent. The slightly
higher rate of inbreeding as computed directly from the pedigrees can be
attributed to some irregularities in the mating system, resulting from the
sterility of some parents and the death of some whole litters. The random
drift of a colour gene in this line, and two others maintained in the same
manner, was shown in Fig. 3.2.
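
The two rates compared in this example can be recomputed as follows.
This is a small illustrative sketch; the function names are hypothetical,
and the formulas are those cited in the text (equation 3.12 solved for the
rate, and the rate taken as one over twice the effective number).

```python
def rate_from_pedigree(f_t, t):
    """Average rate dF that satisfies f_t = 1 - (1 - dF)**t."""
    return 1.0 - (1.0 - f_t) ** (1.0 / t)


def rate_from_effective_number(n_e):
    """Rate of inbreeding taken as 1 / (2 * Ne)."""
    return 1.0 / (2.0 * n_e)


pedigree_rate = rate_from_pedigree(0.451, 24)

pairs = 6
actual_number = 2 * pairs              # 12 parents in each generation
effective_number = 2 * actual_number   # twice the actual, as stated in the text
ne_rate = rate_from_effective_number(effective_number)

# Both rates in per cent per generation.
print(round(100 * pedigree_rate, 2), round(100 * ne_rate, 2))
```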
Fixation. One is often more interested in the probability of
fixation as a consequence of inbreeding than in the inbreeding
coefficient. The inbreeding coefficient gives the probability of an
individual being a homozygote, which is $1 - 2p_0q_0(1 - F)$ from Table 3.1.
But one wants to know also how soon all individuals in a line can be
expected to be homozygous for the same allele. This is the "purity"
implied by the term "pure line" which is often used to mean highly
inbred line. The degree of "purity" is the probability of fixation.
The probability of fixation has been worked out by Haldane (1937,
1955), Schafer (1937) and Fisher (1949). It depends on the number
of alleles and their arrangement in the initial mating of the line. The
probabilities of fixation over the first 20 generations of full-sib mating
are given in Table 5.1, when 4 alleles were present in the initial
mating. There cannot, of course, be more than 4 alleles in a sib-mated
line, and when there are fewer the probability of fixation is
greater (see Haldane, 1955).

Linkage. Linkage introduces a problem in connexion with the
consequences of inbreeding of which a solution is sometimes needed.
Individuals heterozygous at a particular locus will also be heterozygous
for a segment of chromosome in which the locus lies, and it
may be of interest to know the average length of heterozygous
segments. The form in which this problem most commonly arises is
connected with the transference of a marker gene to an inbred line by
repeated backcrosses, when one wants to know how much of the
foreign chromosome is transferred along with the marker. This
problem has been worked out by Bartlett and Haldane (1935). A
dominant gene can be transferred by successive crosses of the
heterozygote to the strain into which it is to be introduced. In this case
the mean length of chromosome introduced with the gene after t crosses
is $1/t$ cross-over units on each side of the gene. A recessive gene is
commonly transferred by alternating backcrosses and intercrosses
from which the homozygote is extracted. The mean length of foreign
chromosome in this case is $2/t$ cross-over units on each side, after t
cycles. Other cases are described in the paper cited. From this and a
knowledge of the total map length of the organism we can arrive at
the expected proportion of the total chromatin that is still
heterogeneous.

Example 5.4. What percentage of the total chromatin is expected to
be still heterogeneous after a dominant gene has been transferred to an
inbred strain of mice by five, and by ten successive backcrosses? The
expected length of heterogeneous chromosome associated with the gene
is 0.2 centimorgans after five crosses, and 0.1 cM after ten. The average
map length of the 20 chromosomes in male mice is 97.7 cM (Slizynski,
1955). Therefore 0.2 per cent of the chromosome will be heterogeneous
after five crosses, and 0.1 per cent after ten, assuming that the gene is
transferred through males, and taking the average as being the length of
the chromosome carrying the gene. The percentage of chromatin not
associated with the gene that is expected still to be heterogeneous can be
taken as approximately $1 - F_t$ from column A of Table 5.1: that is, 3.1 per
cent after five crosses and 0.1 per cent after ten. The total percentage of
heterogeneous chromatin is therefore 3.4 per cent after five crosses, and
0.2 per cent after ten.

Fig. 5.7. Theoretical models illustrating the distribution of
heterozygous segments of chromosome (shown black) after (a) 5
generations, (b) 10 generations, and (c) 20 generations of full-sib
mating, in an organism with twenty chromosomes, such as the
mouse. The total map-length is taken to be 2500 centimorgans,
and the chromosomes are assumed to be of equal genetic length.
The points marked A, B, C, D, in chromosomes I to IV are loci
held heterozygous by forced segregation, and the associated
heterozygous segments are cross-hatched. (From Fisher, The Theory of
Inbreeding, Oliver and Boyd, 1949; reproduced by courtesy of the
author and publishers.)
The more general problem of the mean length of heterozygous
segments during inbreeding has been treated by Haldane (1936) and
by Fisher (1949). It need not be discussed in detail here. The
conclusions are well illustrated in Fig. 5.7, which is Fisher's diagrammatic
representation of the situation in an organism with 20 chromosomes,
such as the mouse, after five, ten, and twenty generations of full-sib
mating. The diagrams show the expected number and lengths of
unfixed segments. The first four chromosomes are supposed to carry
loci at which segregation is maintained by mating always
heterozygotes with homozygotes. The slower reduction of the lengths of
these unfixed segments can be seen.

Mutation. After a long period of inbreeding mutation may
become an important factor in determining the frequency of
heterozygotes. If $u$ is the mutation rate of a gene that has reached
near-fixation in the line, then the frequency of heterozygotes at this locus
due to mutation is $4u$ under self-fertilisation, and $12u$ under full-sib
mating, for autosomal loci (Haldane, 1936). These are very small
frequencies if we are concerned with only one locus, but if the effects
of all loci are taken together mutation is not entirely negligible as a
source of heterozygosis in long inbred strains such as the widely used
strains of mice. The practical consequences of the origin of
heterogeneity by mutation are that the characteristics of a line will slowly
change through the fixation of mutant alleles, and that sub-lines will
become differentiated. Examples are given in Chapter 15.

Selection favouring heterozygotes. When close inbreeding is
practised the object is generally to produce fixation, or homozygosis
within the lines, and the experimenter is not usually interested in the
differentiation between lines. It is therefore a matter of little concern
which allele is fixed, so long as fixation occurs. Selection against a
deleterious recessive may prevent the deleterious allele becoming
fixed, but it will not prevent or delay the fixation of the more
favourable allele. Therefore the conclusions about selection reached in the
previous chapter are of little relevance to close inbreeding. Selection
that favours heterozygotes, however, is another matter. A
consequence of inbreeding almost universally observed is a reduction of
fitness, the reasons for which will be given in Chapter 14. Thus
selection resists the inbreeding, since the more homozygous
individuals are the less fit, and this can only mean that selection favours
heterozygotes: not necessarily heterozygotes of the loci taken singly,
but heterozygotes of segments of chromosome. It is only necessary
to have two deleterious genes, recessive or partially recessive, linked
in repulsion, to confer a selective advantage on the heterozygote of
the segment of chromosome within which the genes are located. It is
therefore important to find out how the opposing tendencies of
inbreeding and selection in favour of heterozygotes balance each
other, in order to assess the reliability of the computed inbreeding
coefficient as a measure of the probability of fixation.

The outcome of the joint action of inbreeding and selection in
favour of heterozygotes depends on whether there is replacement of
the less fit lines by the more fit; in other words, on whether selection
operates between lines or only within lines. Within any one line,
selection against homozygotes only delays the progress toward
fixation and cannot arrest it, the delay being roughly in proportion to
the intensity of the selection (Reeve, 1955a). Table 5.2 shows the
rates of inbreeding with various intensities of selection, when there
are two alleles and selection acts equally against both homozygotes.
(The rate of inbreeding, $\Delta F$, is used here to mean the rate of
dispersion of gene frequencies and, after the first few generations when
the distribution of gene frequencies has become flat, it measures the rate
of fixation, i.e. the proportion of unfixed loci that become fixed in
each generation, as explained in Chapter 3.) The delay of fixation
caused by selection is least under the closest systems of inbreeding.
Thus the rate is halved under self-fertilisation when the coefficient
of selection is 0.67; under full-sib mating when it is 0.44; and under
half-sib mating when it is 0.35. It will be seen from the table that the
rate of inbreeding, though much reduced by intense selection, does
not become zero until the coefficient of selection rises to 1. If there is
only one line, therefore, fixation eventually goes to completion, unless
both homozygotes are entirely inviable or sterile.

Table 5.2

Rate of inbreeding, $\Delta F$, with selection favouring the
heterozygote. (Except with self-fertilisation, the rates are
only approximate over the first few generations of
inbreeding.)

Coefficient of selection                 $\Delta F$ (%)
against the homozygotes    Self-fertilisation   Full sib   Half sib*
          (s)
          0                      50.00           19.10      13.01
          0.2                    44.44           14.88       9.32
          0.4                    37.50           10.32       5.6?
          0.6                    28.57            5.71       2.48
          0.75                   20.00            2.62       0.82
          0.8                    16.67            1.76       0.46

* Females full sisters to each other.
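
The self-fertilisation column of Table 5.2 can be reproduced from a
simple argument, offered here as a sketch under stated assumptions
rather than as the method of the paper cited. A selfed heterozygote gives
progeny in the proportions 1/4 : 1/2 : 1/4. If both homozygotes are
assumed to have fitness 1 - s relative to the heterozygote, the proportion
of homozygotes among the surviving progeny, and hence the chance that a
still unfixed line becomes fixed in one generation, is (1 - s)/(2 - s).
The function name is illustrative.

```python
def selfing_fixation_rate(s):
    """Rate of fixation per generation under self-fertilisation.

    Assumed model: relative fitnesses 1 - s, 1, 1 - s for the three
    genotypes, and one surviving offspring taken at random per line.
    """
    het = 0.5                  # heterozygous progeny of a selfed heterozygote
    hom = 0.5 * (1.0 - s)      # homozygous progeny surviving selection
    return hom / (het + hom)   # algebraically (1 - s) / (2 - s)


for s in (0.0, 0.2, 0.4, 0.6, 0.75, 0.8):
    print(s, round(100 * selfing_fixation_rate(s), 2))
```

Setting the rate equal to half its unselected value of 0.5 gives s = 2/3,
which agrees with the coefficient of 0.67 quoted in the text for
self-fertilisation. No comparably simple expression is attempted here for
the full-sib and half-sib columns.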

If  there  are  many  lines,  however,  selection  may  arrest  the  progress 
of  fixation  and  lead  to  a  state  of  equilibrium,  for  the  following  reason. 
The  amount  by  which  the  inbreeding  has  changed  the  frequency  of  a 
particular  gene  from  its  original  value  differs  at  any  one  time  from 
line  to  line.  In  other  words,  the  state  of  dispersion  of  the  locus  has 
gone  further  in  some  lines  than  in  others.  Now,  if  those  lines  in 
which  the  dispersion  has  gone  furthest,  and  which  are  consequently 
most  reduced  in  fitness,  die  out  or  are  discarded,  and  if  they  are 
replaced by sub-lines taken from the lines in which it has gone least
far, then the progress of the dispersive process will have been set
back. When there is replacement of lines in this way, and the
selection is sufficiently intense, a state of balance between the opposing
tendencies of inbreeding and selection is reached. The intensity of
selection needed to arrest the dispersive process has been worked out
for regular systems of close inbreeding (Hayman and Mather, 1953).
Some of the conclusions, for the case of two alleles with equal
selection against the two homozygotes, are given in Table 5.3, which
shows the intensity of selection against the homozygotes which will
(a) just allow fixation to go eventually to completion, and (b) arrest
the dispersive process at a point of balance where the frequency of
heterozygotes is half its original value, i.e. where $P = 1 - F = 0.5$.
These figures show that only a moderate advantage of heterozygotes
will suffice to prevent complete fixation. Under full-sib mating, for
example, loci, or segments of chromosomes that do not recombine,
with a 25 per cent disadvantage in homozygotes will not all go to
fixation. And, of those with a 50 per cent disadvantage, only about
half will become fixed, no matter for how long the inbreeding is
continued.

Table 5.3

Balance between inbreeding and selection in favour of
heterozygotes, when selection operates between lines. The
figures are the selective disadvantages of homozygotes, $s$,
expressed as percentages. Column (a) shows the highest
value of $s$ compatible with complete fixation. Column (b)
shows the value of $s$ that leads to a steady state at
$P = 1 - F = 0.5$.

Mating system          (a) (P = 0)    (b) (P = 0.5)
Self-fertilisation      [missing]        66.7
Full-sib                  23.7           44.6
Half-sib                [missing]      [missing]

It must be stressed, however, that prevention of fixation in this
way can only take place when there is replacement of lines and
sub-lines. The following breeding methods, for example, would allow
replacement of lines: if seed, set by self-fertilisation, were collected
in bulk and a random sample taken for planting, and this were
repeated in successive generations; or, if sib pairs of mice were taken at
random from all the surviving progeny, so that the same amount of
breeding space was occupied in successive generations.
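
The first of these breeding methods, self-fertilisation with seed
collected in bulk, can be followed numerically for a single locus. The
sketch below is an illustration under simple assumptions of its own (two
alleles, relative fitnesses 1 - s, 1, 1 - s, a bulk large enough to ignore
sampling, and every individual heterozygous at the start); it is not taken
from Hayman and Mather (1953), and the function names are hypothetical.
Under these assumptions the frequency of heterozygotes, H, passes in one
generation to (H/2) / [H/2 + (1 - s)(1 - H/2)], which settles at
(2s - 1)/s when s exceeds 0.5 and at zero otherwise.

```python
def next_heterozygosity(h, s):
    """One generation of bulk selfing followed by selection on the progeny."""
    het = 0.5 * h                        # heterozygotes before selection
    hom = (1.0 - 0.5 * h) * (1.0 - s)    # homozygotes surviving selection
    return het / (het + hom)


def equilibrium_heterozygosity(s, generations=2000):
    """Iterate from an all-heterozygous start (an assumed starting point)."""
    h = 1.0
    for _ in range(generations):
        h = next_heterozygosity(h, s)
    return h


def predicted_equilibrium(s):
    """Closed-form fixed point of the recurrence above."""
    return max(0.0, (2.0 * s - 1.0) / s)


for s in (0.4, 2.0 / 3.0, 0.8):
    print(round(s, 3),
          round(equilibrium_heterozygosity(s), 4),
          round(predicted_equilibrium(s), 4))
```

With s = 2/3 the heterozygotes settle at half their initial frequency,
which is consistent with the figure of 66.7 per cent given for
self-fertilisation in Table 5.3; with s below 0.5 the sketch goes to
complete fixation.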

The  conclusions  outlined  above  refer  to  a  single  locus.  If  there 
were  more  than  a  few  loci  on  different  chromosomes  all  subject  to 
selection  against  homozygotes  of  an  intensity  sufficient  to  arrest  or 
seriously  delay  the  progress  of  inbreeding,  the  total  loss  of  fitness 
from  all  the  loci  would  be  very  severe.  Inbred  lines  of  organisms 
with  a  high  reproductive  rate,  such  as  plants  and  Drosophila,  might 
well  stand  up  to  a  total  loss  of  fitness  sufficient  to  keep  several  loci 
or  segments  of  chromosome  permanently  unfixed.  But  the  loss  of 
fitness  involved  in  preventing  the  fixation  of  more  than  two  or  three 
loci  in  an  organism  such  as  the  mouse  would  be  crippling.  Under 
laboratory  conditions  the  highly  inbred  strains  of  mice,  after  100  or 
more  generations  of  sib-mating,  have  a  fitness  not  much  less  than  half 
that  of  non-inbred  strains.  It  is  conceivable  that  they  might  have  one 
locus  permanently  unfixed,  but  it  is  difficult  to  believe  that  they  can 
have  more.  Complete  lethality  or  sterility  of  both  homozygotes  at 
one  locus  means  a  50  per  cent  loss  of  progeny;  at  two  unlinked  loci,  a 
75  per  cent  loss.  A  mouse  strain  with  a  mortality  or  sterility  of  50 
per  cent  can  be  kept  going,  but  hardly  one  with  75  per  cent. 


CONTINUOUS VARIATION

It  will  be  obvious,  to  biologist  and  layman  alike,  that  the  sort  of 
variation  discussed  in  the  foregoing  chapters  embraces  but  a  small 
part  of  the  naturally  occurring  variation.  One  has  only  to  consider 
one's  fellow  men  and  women  to  realise  that  they  all  differ  in  countless 
ways,  but  that  these  differences  are  nearly  all  matters  of  degree  and 
seldom  present  clear-cut  distinctions  attributable  to  the  segregation 
of single genes. If, for example, we were to classify individuals
according to their height, we could not put them into groups labelled
"tall"  and  "short,"  because  there  are  all  degrees  of  height,  and  a 
division  into  classes  would  be  purely  arbitrary.  Variation  of  this  sort, 
without  natural  discontinuities,  is  called  continuous  variation,  and 
characters  that  exhibit  it  are  called  quantitative  characters  or  metric 
characters,  because  their  study  depends  on  measurement  instead  of 
on  counting.  The  genetic  principles  underlying  the  inheritance  of 
metric  characters  are  basically  those  outlined  in  the  previous  chapters, 
but  since  the  segregation  of  the  genes  concerned  cannot  be  followed 
individually,  new  methods  of  study  have  had  to  be  developed  and 
new  concepts  introduced.  A  branch  of  genetics  has  consequently 
grown  up,  concerned  with  metric  characters,  which  is  called  variously 
population  genetics,  biometrical  genetics  or  quantitative  genetics.  The 
importance  of  this  branch  of  genetics  need  hardly  be  stressed;  most 
of  the  characters  of  economic  value  to  plant  and  animal  breeders  are 
metric  characters,  and  most  of  the  changes  concerned  in  micro- 
evolution  are  changes  of  metric  characters.  It  is  therefore  in  this 
branch  that  genetics  has  its  most  important  application  to  practical 
problems  and  also  its  most  direct  bearing  on  evolutionary  theory. 

How does it come about that the intrinsically discontinuous
variation caused by genetic segregation is translated into the continuous
variation of metric characters? There are two reasons: one is the
simultaneous segregation of many genes affecting the character, and
the other is the superimposition of truly continuous variation arising
from non-genetic causes. Consider, for example, a simplified
situation. Suppose there is segregation at six unlinked loci, each with two
alleles at frequencies of 0.5. Suppose that there is complete
dominance of one allele at each locus and that the dominant alleles each add
one unit to the measurement of a certain character. Then if the
segregation of these genes were the only cause of variation there would
be 7 discrete classes in the measurements of the character, according
to whether the individual had the dominant allele present at 0, 1, 2, ...
or 6 of the loci. The frequencies of the classes would be according to
the binomial expansion of $(\tfrac{1}{4} + \tfrac{3}{4})^6$, as shown in
Fig. 6.1 (a). If our measurements were sufficiently accurate we should
recognise these classes as being distinct and we should be able to place
any individual unambiguously in its class. If there were more genes
segregating but each had a smaller effect, there would be more classes
with smaller differences between them, as in Fig. 6.1 (b). It would then
be more difficult to distinguish the classes, and if the difference between
the classes became about as small as the error of measurement we should
no longer be able to recognise the discontinuities. In addition, metric
characters are subject to variation from non-genetic causes, and this
variation is truly continuous. Its effect is, as it were, to blur the edges
of the genetic discontinuity so that the variation as we see it becomes
continuous, no matter how accurate our measurements may be.

Fig. 6.1. Distributions expected from the simultaneous
segregation of two alleles at each of several or many loci: (a) 6 loci,
(b) 24 loci. There is complete dominance of one allele over the other at
each locus, and the gene frequencies are all 0.5. Each locus, when
homozygous for the recessive allele, is supposed to reduce the
measurement by 1 unit in (a), and by $\tfrac{1}{4}$ unit in (b). The
horizontal scale, representing the measurement, shows the number of loci
homozygous for the recessive allele, and the vertical axis shows the
probability, or the percentage of individuals expected in each class.
The probabilities are derived from the binomial expansion of
$(\tfrac{1}{4} + \tfrac{3}{4})^n$, where n is the number of loci, and they
are taken from the tables of Warwick (1932).
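
The class frequencies for the six-locus and twenty-four-locus cases can
be generated directly from the binomial expansion. The following is a
minimal sketch, not the source of the published figure (which cites
printed tables); the function name is illustrative. It assumes, as in the
text, gene frequencies of 0.5 and complete dominance, so that each locus
is homozygous for the recessive allele with probability 1/4.

```python
from math import comb


def class_frequencies(n_loci, p_recessive_homozygote=0.25):
    """Probabilities that 0, 1, ..., n loci are homozygous recessive."""
    p = p_recessive_homozygote
    return [comb(n_loci, k) * p**k * (1 - p) ** (n_loci - k)
            for k in range(n_loci + 1)]


six = class_frequencies(6)            # 7 classes
twenty_four = class_frequencies(24)   # 25 classes

print(len(six), len(twenty_four))
print([round(100 * f, 1) for f in six])  # percentages in each class
```

With 24 loci there are 25 classes instead of 7, so that for the same
total range the step between adjacent classes is a quarter as large.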

Thus  the  distinction  between  genes  concerned  with  Mendelian 
characters  and  those  concerned  with  metric  characters  lies  in  the 
magnitude  of  their  effects  relative  to  other  sources  of  variation.  A 
gene  with  an  effect  large  enough  to  cause  a  recognisable  discontinuity 
even  in  the  presence  of  segregation  at  other  loci  and  of  non-genetic 
variation  can  be  studied  by  Mendelian  methods,  whereas  a  gene  whose 
effect  is  not  large  enough  to  cause  a  discontinuity  cannot  be  studied 
individually.  This  distinction  is  reflected  in  the  terms  major  gene  and 
minor  gene.  There  are,  however,  all  intermediate  grades,  genes  that 
cannot  properly  be  classed  as  major  or  as  minor,  such  as  the  "bad 
genes"  of  Mendelian  genetics.  And,  furthermore,  as  a  result  of 
pleiotropy  the  same  gene  may  be  classed  as  major  with  respect  to  one 
character and minor with respect to another character. The
distinction, though convenient, is therefore not a fundamental one, and there
is  no  good  evidence  that  there  are  two  sorts  of  genes  with  different 
properties.  Variation  caused  by  the  simultaneous  segregation  of 
many  genes  may  be  called  polygenic  variation,  and  the  minor  genes 
concerned  are  sometimes  referred  to  as  polygenes  (see  Mather,  1949). 


Metric  Characters 

The metric characters that might be studied in any higher
organism are almost infinitely numerous. Any attribute that varies
continuously and can be measured might in principle be studied as a
metric character: anatomical dimensions and proportions,
physiological functions of all sorts, and mental or psychological qualities.
The essential condition is that they should be measurable. The
technique of measurement, however, sets a practical limitation on
what can be studied. Usually rather large numbers of individuals
have to be measured and the study of any character whose
measurement requires an elaborate technique therefore becomes
impracticable. Consequently the characters that have been used in studies of
quantitative genetics are predominantly anatomical dimensions, or
physiological functions measured in terms of an end-product, such as
lactation, fertility, or growth rate.

Fig. 6.2. Frequency distributions of four metric characters, with
normal curves superimposed. The means are indicated by arrows.
The characters are as follows, the number of observations on which
each histogram is based being given in brackets:

(a) Mouse (♂♂): growth from 3 to 6 weeks of age. (380)

(b) Mouse: litter size (number of live young in 1st litters). (689)

(c) Drosophila melanogaster (♀♀): number of bristles on ventral
surface of 4th and 5th abdominal segments, together. (900)

(d) Drosophila melanogaster (♀♀): number of facets in the eye of
the mutant "Bar". (488)

(a), (b), and (c) are from original data: (d) is from data of Zeleny
(1922).

Some  examples  of  metric  characters  are  illustrated  in  Fig.  6.2. 
The variation is represented graphically by the frequency
distribution of measurements. The measurements are grouped into equally
spaced classes and the proportion of individuals falling in each class
is plotted on the vertical scale. The resulting histogram is
discontinuous only for the sake of convenience in plotting. If the class ranges
were made smaller and the number of individuals measured were
increased indefinitely the histogram would become a smooth curve.
The  variation  of  some  metric  characters,  such  as  bristle  number  or 
litter  size,  is  not  strictly  speaking  continuous  because,  being  measured 
by  counting,  their  values  can  only  be  whole  numbers.  Nevertheless, 
one  can  regard  the  measurements  in  such  cases  as  referring  to  an 
underlying  character  whose  variation  is  truly  continuous  though 
expressible  only  in  whole  numbers,  in  a  manner  analogous  to  the 
grouping  of  measurements  into  classes.  For  example,  litter  size  may 
be  regarded  as  a  measure  of  the  underlying,  continuously  varying 
character,  fertility.  For  practical  purposes  such  characters  can  be 
treated  as  continuously  varying,  provided  the  number  of  classes  is 
not  too  small.  When  there  are  too  few  classes,  as  for  example  when 
susceptibility  to  disease  is  expressed  as  death  or  survival,  different 
methods  have  to  be  employed,  as  will  be  explained  in  Chapter  18. 

The  frequency  distributions  of  most  metric  characters  approxi- 
mate more  or  less  closely  to  normal  curves.  This  can  be  seen  in 
Fig.  6.2,  where  the  smooth  curves  drawn  through  the  histograms  are 
normal  curves  having  means  and  variances  calculated  from  the  data. 
In  the  study  of  metric  characters  it  is  therefore  possible  to  make  use 
of  the  properties  of  the  normal  distribution  and  to  apply  the  appro- 
priate statistical  techniques.  Sometimes,  however,  the  scale  of 
measurement  must  be  modified  if  a  distribution  approximating  to  the 
normal  is  to  be  obtained.  The  distribution  in  Fig.  6.2  {d\  for  example, 
would  be  skewed  if  measured  and  plotted  simply  as  the  number  of 
facets.  But  it  becomes  symmetrical,  and  approximates  to  a  normal 
distribution,  if  measured  and  plotted  in  logarithmic  units.  The 
criteria  on  which  the  choice  of  a  scale  of  measurement  rests  cannot  be 
fully appreciated at this stage, and will be explained in Chapter 17.
Meantime  it  will  be  assumed  that  any  metric  character  under  dis- 
cussion is  measured  on  an  appropriate  scale  and  has  a  distribution 
that  is  approximately  normal. 


General  Survey  of  the  Subject-matter 

There are two basic genetic phenomena concerned with metric
characters, both more or less familiar to all biologists, and each forms
the basis of a breeding method. The first is the resemblance between
relatives.  Everyone  is  familiar  with  the  fact  that  relatives  tend  to 
resemble  each  other,  and  the  closer  the  relationship,  in  general  the 
closer  the  resemblance.  Though  it  is  only  in  our  own  species  that 
resemblances  are  readily  discernible  without  measurement,  the 
phenomenon  is  equally  present  in  other  species.  The  degree  of 
resemblance  varies  with  the  character,  some  showing  more,  some  less. 
The  resemblance  between  offspring  and  parents  provides  the  basis 
for  selective  breeding.  Use  of  the  more  desirable  individuals  as 
parents  brings  about  an  improvement  of  the  mean  level  of  the  next 
generation,  and  just  as  some  characters  show  more  resemblance  than 
others,  so  some  are  more  responsive  to  selection  than  others.  The 
degree  of  resemblance  between  relatives  is  one  of  the  properties  of  a 
population  that  can  be  readily  observed,  and  it  is  one  of  the  aims  of 
quantitative  genetics  to  show  how  the  degree  of  resemblance  between 
different  sorts  of  relatives  can  be  used  to  predict  the  outcome  of 
selective  breeding  and  to  point  to  the  best  method  of  carrying  out  the 
selection.  This  problem  will  form  the  central  theme  of  the  next 
seven  chapters,  the  resemblance  between  relatives  being  dealt  with  in 
Chapters 9 and 10, and the effects of selection in Chapters 11-13.

The second basic genetic phenomenon is inbreeding depression,
with its converse hybrid vigour, or heterosis. This phenomenon is less
familiar  to  the  layman  than  the  first,  since  the  laws  against  incest  pre- 
vent its  more  obvious  manifestations  in  our  own  species;  but  it  is  well 
known  to  animal  and  plant  breeders.  Inbreeding  tends  to  reduce  the 
mean  level  of  all  characters  closely  connected  with  fitness  in  animals 
and  in  naturally  outbreeding  plants,  and  to  lead  in  consequence  to  loss 
of  general  vigour  and  fertility.  Since  most  characters  of  economic 
value  in  domestic  animals  and  plants  are  aspects  of  vigour  or  fertility, 
inbreeding  is  generally  deleterious.  The  reduced  vigour  and  fertility 
of inbred lines is restored on crossing, and in certain circumstances
this  hybrid  vigour  can  be  made  use  of  as  a  means  of  improvement. 
The  enormous  improvement  of  the  yield  of  commercially  grown 
maize  has  been  achieved  by  this  means  and  represents  probably  the 
greatest practical achievement of genetics (see Mangelsdorf, 1951).
The  effects  of  inbreeding  and  crossing  will  be  described  in  Chapters 
14-16. 

The  properties  of  a  population  that  we  can  observe  in  connexion 
with  a  metric  character  are  means,  variances,  and  covariances.  The 
natural  subdivision  of  the  population  into  families  allows  us  to  analyse 
the  variance  into  components  which  form  the  basis  for  the  measure- 
ment of  the  degree  of  resemblance  between  relatives.  We  can  in 
addition  observe  the  consequences  of  experimentally  applied  breed- 
ing methods,  such  as  selection,  inbreeding  or  cross-breeding.  The 
practical  objective  of  quantitative  genetics  is  to  find  out  how  we  can 
use  the  observations  made  on  the  population  as  it  stands  to  predict 
the  outcome  of  any  particular  breeding  method.  The  more  general 
aim  is  to  find  out  how  the  observable  properties  of  the  population  are 
influenced  by  the  properties  of  the  genes  concerned  and  by  the  various 
non-genetic  circumstances  that  may  influence  a  metric  character.  The 
chief  properties  of  genes  that  have  to  be  taken  account  of  are  the 
degree  of  dominance,  the  manner  in  which  genes  at  different  loci 
combine  their  effects,  pleiotropy,  linkage,  and  fitness  under  natural 
selection.  To  take  account  of  all  these  properties  simultaneously,  in 
addition  to  a  variety  of  non-genetic  circumstances,  would  make  the 
problems  unmanageably  complex.  We  therefore  have  to  simplify 
matters  by  dealing  with  one  thing  at  a  time,  starting  with  the  simpler 
situations. 

The  plan  to  be  followed  in  the  succeeding  chapters  is  this:  we 
shall  first  show  what  determines  the  population  mean,  and  then 
introduce  two  new  concepts — average  effect  and  breeding  value — 
which  are  necessary  to  an  understanding  of  the  variance.  Then  we 
shall  discuss  the  variance,  its  analysis  into  components,  and  the  co- 
variance  of  relatives,  which  will  lead  us  to  the  degree  of  resemblance 
between  relatives.  In  all  this  we  shall  take  full  account  of  dominance 
from  the  beginning:  the  other  complicating  factors  will  be  more 
briefly  discussed  when  they  become  relevant.  The  most  important 
simplification  that  we  shall  make  concerns  the  effect  of  genes  on 
fitness:  we  shall  assume  that  Mendelian  segregation  is  undisturbed 
by differential fitness of the genotypes. The description of means,
variances, and covariances will refer to a random breeding population,
with Hardy-Weinberg equilibrium genotype frequencies, with
no  selection  and  no  inbreeding.  That  is  to  say,  we  shall  describe  the 
population  before  any  special  breeding  method  is  applied  to  it.  Then 
in Chapters 11-13 we shall describe the effects of selection, and in
Chapters  14-16  the  effects  of  inbreeding.  This  will  cover  the  funda- 
mentals of  quantitative  genetics,  and  in  the  final  chapters  we  shall 
discuss  some  special  topics. 


CHAPTER   7 


VALUES  AND  MEANS 

We  have  seen  in  the  early  chapters  that  the  genetic  properties  of  a 
population  are  expressible  in  terms  of  the  gene  frequencies  and  geno- 
type frequencies.  In  order  to  deduce  the  connexion  between  these 
on  the  one  hand  and  the  quantitative  differences  exhibited  in  a  metric 
character  on  the  other,  we  must  introduce  a  new  concept,  the  concept 
of value, expressible in the metric units by which the character is
measured. The value observed when the character is measured on an
individual  is  the  phenotypic  value  of  that  individual.  All  observations, 
whether  of  means,  variances,  or  covariances,  must  clearly  be  based  on 
measurements  of  phenotypic  values.  In  order  to  analyse  the  genetic 
properties  of  the  population  we  have  to  divide  the  phenotypic  value 
into  component  parts  attributable  to  different  causes.  Explanation  of 
the  meanings  of  these  components  is  our  chief  concern  in  this  chapter, 
though  we  shall  also  be  able  to  find  out  how  the  population  mean  is 
influenced  by  the  array  of  gene  frequencies. 

The  first  division  of  phenotypic  value  is  into  components  attribut- 
able to  the  influence  of  genotype  and  environment.  The  genotype  is 
the  particular  assemblage  of  genes  possessed  by  the  individual,  and 
the  environment  is  all  the  non-genetic  circumstances  that  influence  the 
phenotypic  value.  Inclusion  of  all  non-genetic  circumstances  under 
the  term  environment  means  that  the  genotype  and  the  environment 
are by definition the only determinants of phenotypic value. The two
components  of  value  associated  with  genotype  and  environment  are 
the  genotypic  value  and  the  environmental  deviation.  We  may  think 
of  the  genotype  conferring  a  certain  value  on  the  individual  and  the 
environment  causing  a  deviation  from  this,  in  one  direction  or  the 
other.  Or,  symbolically, 

P=G+E  (7.1) 

where P is the phenotypic value, G is the genotypic value, and E is the
environmental  deviation.  The  mean  environmental  deviation  in  the 
population  as  a  whole  is  taken  to  be  zero,  so  that  the  mean  phenotypic 
value is equal to the mean genotypic value. The term population
mean  then  refers  equally  to  phenotypic  or  to  genotypic  values.  When 
dealing  with  successive  generations  we  shall  assume  for  simplicity  that 
the  environment  remains  constant  from  generation  to  generation,  so 
that  the  population  mean  is  constant  in  the  absence  of  genetic  change. 
If  we  could  replicate  a  particular  genotype  in  a  number  of  individuals 
and  measure  them  under  environmental  conditions  normal  for  the 
population,  their  mean  environmental  deviations  would  be  zero,  and 
their  mean  phenotypic  value  would  consequently  be  equal  to  the 
genotypic  value  of  that  particular  genotype.  This  is  the  meaning  of 
the  genotypic  value  of  an  individual.  In  principle  it  is  measurable, 
but  in  practice  it  is  not,  except  when  we  are  concerned  with  a  single 
locus  where  the  genotypes  are  phenotypically  distinguishable,  or  with 
the  genotypes  represented  in  highly  inbred  lines. 

For  the  purposes  of  deduction  we  must  assign  arbitrary  values  to 
the  genotypes  under  discussion.  This  is  done  in  the  following  way. 
Considering a single locus with two alleles, A1 and A2, we call the
genotypic value of one homozygote +a, that of the other homozygote
-a, and that of the heterozygote d. (We shall adopt the convention
that A1 is the allele that increases the value.) We thus have a scale of
genotypic values as in Fig. 7.1. The origin, or point of zero value, on
this scale is mid-way between the values of the two homozygotes.


Genotype               A2A2              A1A2    A1A1
Genotypic value         -a        0       d       +a

Fig. 7.1. Arbitrarily assigned genotypic values.

The value, d, of the heterozygote depends on the degree of dominance.
If there is no dominance, d = 0; if A1 is dominant over A2, d is positive,
and if A2 is dominant over A1, d is negative. If dominance is complete,
d is equal to +a or -a, and if there is overdominance d is
greater than +a or less than -a. The degree of dominance may be
expressed as d/a.

Example  7.1.  For  the  purposes  of  illustration  in  this  chapter,  and  also 
later on, we shall refer to a dwarfing gene in the mouse, known as "pygmy"
(symbol pg), described by King (1950, 1955), and by Warwick and Lewis
(1954). This gene reduces body-size and is nearly, but not quite, recessive
in its effect on size. It was present in a strain of small mice (MacArthur's)
at  the  time  the  studies  cited  above  were  made.  The  weights  of  mice  of  the 
three genotypes at 6 weeks of age were approximately as follows (sexes
averaged):

                       ++      +pg     pgpg
Weight in grams        14      12      6

(The  weight  of  heterozygotes  given  here  is  to  some  extent  conjectural,  but 
it  is  unlikely  to  be  more  than  1  gm.  in  error.)  These  are  average  weights 
obtained  under  normal  environmental  conditions,  and  they  are  therefore 
the  genotypic  values.  The  mid-point  in  genotypic  value  between  the  two 
homozygotes  is  10  gm.,  and  this  is  the  origin,  or  zero-point,  on  the  scale 
of values assigned as in Fig. 7.1. The value of a on this scale is therefore
4  gm.,  and  that  of  d  is  2  gm. 
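
The conversion of observed genotypic values to the scale of Fig. 7.1 may be sketched in a few lines of Python. This is an illustration only: the function name scale_values is hypothetical, and the weights are the approximate figures quoted in the example.

```python
def scale_values(g11, g12, g22):
    """Return (origin, a, d) for one locus.

    g11, g12, g22 are the genotypic values of A1A1, A1A2 and A2A2,
    where A1 is taken to be the allele that increases the value.
    """
    origin = (g11 + g22) / 2   # mid-homozygote point, the zero of the scale
    a = g11 - origin           # deviation of the upper homozygote
    d = g12 - origin           # deviation of the heterozygote
    return origin, a, d


# Approximate 6-week weights (gm.) of ++, +pg and pgpg mice from Example 7.1.
origin, a, d = scale_values(14, 12, 6)
print(origin, a, d)   # prints: 10.0 4.0 2.0
```

The ratio d/a, here 0.5, is positive but less than 1, which agrees with the statement that pygmy is nearly, but not quite, recessive.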

Population  Mean 

We  can  now  see  how  the  gene  frequencies  influence  the  mean  of 
the  character  in  the  population  as  a  whole.  Let  the  gene  frequencies 
of A1 and A2 be p and q respectively. Then the first two columns of
Table  7.1  show  the  three  genotypes  and  their  frequencies  in  a  random 
breeding  population,  from  formula  1.2.  The  third  column  shows  the 
genotypic  values  as  specified  above.    The  mean  value  in  the  whole 

Table 7.1

Genotype      Frequency      Value      Freq. x val.
A1A1          p^2            +a         p^2 a
A1A2          2pq            d          2pqd
A2A2          q^2            -a         -q^2 a

                             Sum = a(p - q) + 2dpq


population  is  obtained  by  multiplying  the  value  of  each  genotype  by 
its  frequency  and  summing  over  the  three  genotypes.  The  reason  why 
this  yields  the  mean  value  may  be  understood  by  converting  fre- 
quencies to  numbers  of  individuals.  Multiplying  the  value  by  the 
number  of  individuals  in  each  genotype  and  summing  over  genotypes 
gives  the  sum  of  values  of  all  individuals.  The  mean  value  would  then 
be  this  sum  of  values  divided  by  the  total  number  of  individuals.  The 
procedure  in  working  with  frequencies  is  the  same,  but  since  the  sum 
of  the  frequencies  is  1,  the  sum  of  values  x  frequencies  is  the  mean 
value.  In  other  words,  the  division  by  the  total  number  has  already 
been  made  in  obtaining  the  frequencies.  Multiplication  of  values  by 
frequencies  to  obtain  the  mean  value  is  a  procedure  that  will  be  often 
used in this chapter and subsequent ones. Returning to the population
mean, multiplication of the value by the frequency of each
genotype is shown in the last column of Table 7.1. Summation of
this column is simplified by noting that p^2 - q^2 = (p + q)(p - q) = p - q.
The  population  mean,  which  is  the  sum  of  this  column,  is  thus 


M = a(p - q) + 2dpq          (7.2)


This  is  both  the  mean  genotypic  value  and  the  mean  phenotypic 
value  of  the  population  with  respect  to  the  character. 
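
The equivalence of the two procedures, summing over individuals and summing values multiplied by frequencies, may be checked numerically. The sketch below is illustrative only: the function names, the population of 1,000 individuals, and the values a = 4, d = 2, p = q = 0.5 are all assumptions chosen for convenience.

```python
def mean_from_counts(values, counts):
    """Mean obtained by summing the values of all individuals and dividing by their number."""
    total_value = sum(v * n for v, n in zip(values, counts))
    return total_value / sum(counts)


def mean_from_frequencies(values, freqs):
    """Mean obtained as the sum of value x frequency."""
    return sum(v * f for v, f in zip(values, freqs))


a, d = 4.0, 2.0            # illustrative genotypic values (assumed)
p, q = 0.5, 0.5            # illustrative gene frequencies (assumed)
values = [a, d, -a]        # A1A1, A1A2, A2A2
counts = [250, 500, 250]   # hypothetical population in Hardy-Weinberg proportions
freqs = [n / sum(counts) for n in counts]

m_counts = mean_from_counts(values, counts)
m_freqs = mean_from_frequencies(values, freqs)
m_formula = a * (p - q) + 2 * d * p * q   # equation 7.2
print(m_counts, m_freqs, m_formula)       # prints: 1.0 1.0 1.0
```

All three routes give the same mean, because the division by the total number is already contained in the frequencies.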

The contribution of any locus to the population mean thus has
two terms: a(p - q) attributable to homozygotes, and 2dpq attributable
to heterozygotes. If there is no dominance (d = 0) the second term is
zero, and the mean is proportional to the gene frequency: M = a(1 - 2q).
If there is complete dominance (d = a) the mean is proportional to the
square of the gene frequency: M = a(1 - 2q^2). The total range of
values attributable to the locus is 2a, in the absence of overdominance.
That is to say, if A1 were fixed in the population (p = 1) the population
mean would be a, and if A2 were fixed (q = 1) it would be -a.
If the locus shows overdominance, however, the mean of an unfixed
population may lie outside this range.

Example  7.2.  Let  us  take  again  the  pygmy  gene  in  mice,  as  described 
in  Example  7.1,  and  see  what  effect  this  gene  would  have  on  the  population 
mean  when  present  at  two  particular  frequencies.  First,  the  total  range  is 
from  6  gm.  to  14  gm.:  a  population  consisting  entirely  of  pygmy  homo- 
zygotes would  have  a  mean  of  6  gm.,  and  one  from  which  the  gene  was 
entirely  absent  would  have  a  mean  of  14  gm.  (These  values  refer  speci- 
fically to  MacArthur's  Small  Strain  at  the  time  the  observations  were 
made.) Now suppose the gene were present at a frequency of 0.1, so that
under random mating homozygotes would appear with a frequency of 1
per cent. The values to be substituted in equation 7.2 are p = 0.9, q = 0.1,
and a = 4 gm., d = 2 gm., as shown in Example 7.1. The population mean,
by equation 7.2, is therefore: M = 4 x 0.8 + 2 x 0.18 = 3.56. This value of
the mean, however, is measured from the mid-homozygote point, which is
10 gm., as origin. Therefore the actual value of the population mean is
13.56 gm. Next suppose the gene were present at a frequency of 0.4.
Substituting in the same way, we find M = 1.76, to which must be added
10 gm. for the origin, giving a value of 11.76 gm. Rough corroboration of
these figures is given by the records of the strain carrying the gene.
When the gene was present at a frequency of about 0.4 the mean weight
was  about  12  gm.  Two  generations  later,  when  the  pygmy  gene  had  been 
deliberately  eliminated,  the  mean  weight  rose  to  about  14  gm. 
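
The arithmetic of this example can be reproduced with a short Python sketch of equation 7.2. The function name population_mean and its origin argument are hypothetical conveniences, not part of the original notation.

```python
def population_mean(a, d, q, origin=0.0):
    """Population mean from equation 7.2, M = a(p - q) + 2dpq.

    q is the frequency of the allele that lowers the value (here pg);
    origin is the mid-homozygote point, added to give the absolute value.
    """
    p = 1.0 - q
    return origin + a * (p - q) + 2 * d * p * q


# Pygmy gene: a = 4 gm., d = 2 gm., origin = 10 gm. (Example 7.1).
for q in (0.0, 0.1, 0.4, 1.0):
    print(q, round(population_mean(4, 2, q, origin=10), 2))
```

At q = 0.1 and q = 0.4 this gives 13.56 and 11.76 gm., as in the text, and at q = 0 and q = 1 it gives the two ends of the range, 14 and 6 gm.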

Now we have to put together the contributions of genes at several
loci and find their joint effect on the mean. This introduces the
question of how genes at different loci combine to produce a joint
effect on the character. For the moment we shall suppose that combination
is by addition, which means that the value of a genotype
with respect to several loci is the sum of the values attributable to the
separate loci. For example, if the genotypic value of A1A1 is aA and
that of B1B1 is aB, then the genotypic value of A1A1B1B1 is aA + aB.
The  consequences  of  non-additive  combination  will  be  explained  at 
the  end  of  this  chapter.  With  additive  combination,  then,  the  popu- 
lation mean  resulting  from  the  joint  effects  of  several  loci  is  the  sum  of 
the  contributions  of  each  of  the  separate  loci,  thus: 


M = Σa(p - q) + 2Σdpq          (7.3)


This is again both the genotypic and the phenotypic mean value. The
total range in the absence of overdominance is now 2Σa. If all alleles
that increase the value were fixed the mean would be +Σa, and if all
alleles that decrease the value were fixed it would be -Σa. These are
the theoretical limits to the range of potential variation in the population.
The origin from which the mean value in equation 7.3 is
measured is the mid-point of the total range. This is equivalent to
the average mid-homozygote point of all the loci separately.

Example  7.3.  As  an  example  of  two  loci  that  combine  additively,  and 
also  of  their  joint  effects  on  the  population  mean,  we  shall  refer  to  two 
colour  genes  in  mice,  whose  effects  on  the  number  of  pigment  granules 
have  been  described  by  Russell  (1949).  This  is  a  metric  character  which 
reflects  the  intensity  of  pigmentation  in  the  coat.  The  two  genes  are 
"brown"  (b)  and  "extreme  dilution"  (ce),  an  allele  of  the  albino  series. 
Measurements  were  made  of  the  number  of  melanin  granules  per  unit 
volume  of  hair,  in  wild-type  homozygotes,  in  the  two  single  mutant  homo- 
zygotes,  and  in  the  double  mutant  homozygote.  We  shall  assume  both 
wild-type  alleles  to  be  completely  dominant,  so  that  only  these  four  geno- 
types need  be  considered.  The  mean  numbers  of  granules  in  the  four 
genotypes  were  as  follows: 


                 B-        bb        2aB
C-               95        90        5
cece             38        34        4
2aC              57        56

The  difference  between  the  two  figures  in  each  row  and  in  each  column 
measures  the  homozygote  difference,  or  2a  on  the  scale  of  values  assigned 
as  in  Fig.  7.1.  Apart  from  the  trivial  discrepancy  of  1  unit,  these  differences 
are  independent  of  the  genotype  at  the  other  locus.  In  other  words,  the 
difference  of  value  between  B  -  and  bb  is  the  same  among  C  -  genotypes 
as  it  is  among  cece  genotypes;  and  similarly  the  difference  between  C  -  and 
cece  is  the  same  in  B  -  as  it  is  in  bb.  Thus  the  two  loci  combine  addi- 
tively,  and  the  value  of  a  composite  genotype  can  be  rightly  predicted 
from  knowledge  of  the  values  of  the  single  genotypes.  For  example:  the 
bb  genotype  is  5  units  less  than  the  wild-type,  and  the  cece  is  57  units  less; 
therefore  bb  cece  should  be  62  units  less,  namely  33,  which  is  almost  iden- 
tical with  the  observed  value  of  34. 

We  may  use  this  example  further  to  illustrate  the  effect  of  the  two 
loci  jointly  on  the  population  mean.  Let  us  work  out,  from  the  effects  of 
the  loci  taken  separately,  what  would  be  the  mean  granule  number  in  a 
population in which the frequency of bb was q^2 = 0.4, and that of cece
was q^2 = 0.2. For the effects of the loci separately we shall take aB = 2 and
aC = 28. The population mean, considering one locus, is M = a(1 - 2q^2),
when there is complete dominance. For the B locus this is MB = 2 x 0.2
= 0.4; and for the C locus MC = 28 x 0.6 = 16.8. The mean, considering both
loci together, is MB + MC = 17.2 (by equation 7.3). The point of origin from
which this is measured is the mid-point between the two double homozygotes,
which is ½(95 + 34) = 64.5. Thus the mean granule number in this
population would be 64.5 + 17.2 = 81.7. We may check this from the
observations of the values of the joint genotypes. The four genotypes would
have the following frequencies and observed values:


Genotype              B- C-      B- cece      bb C-      bb cece
Frequency             0.48       0.12         0.32       0.08
Observed value        95         38           90         34

The mean value is obtained by multiplying the values by the frequencies
and summing over the four genotypes. This yields a mean granule number
of 81.68.
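
Both routes to the mean in this example can be followed in a short Python sketch. The names used are hypothetical, and the joint genotype frequencies are taken, as in the table above, to be the products of the frequencies at the two loci.

```python
def dominant_locus_mean(a, q2):
    """Single-locus mean under complete dominance, M = a(1 - 2q^2).

    q2 is the frequency of the recessive homozygote.
    """
    return a * (1 - 2 * q2)


a_B, a_C = 2, 28           # rounded homozygote effects used in the text
q2_B, q2_C = 0.4, 0.2      # frequencies of bb and cece
origin = (95 + 34) / 2     # mid-point between the two double homozygotes

# Route 1: sum of the separate contributions of the loci (equation 7.3).
mean_additive = origin + dominant_locus_mean(a_B, q2_B) + dominant_locus_mean(a_C, q2_C)

# Route 2: observed values of the joint genotypes weighted by their frequencies.
observed = {("B-", "C-"): 95, ("B-", "cece"): 38, ("bb", "C-"): 90, ("bb", "cece"): 34}
freq_B = {"B-": 1 - q2_B, "bb": q2_B}
freq_C = {"C-": 1 - q2_C, "cece": q2_C}
mean_direct = sum(freq_B[b] * freq_C[c] * v for (b, c), v in observed.items())

print(round(mean_additive, 2), round(mean_direct, 2))   # prints: 81.7 81.68
```

The small difference between the two figures reflects the rounding of aB and aC and the discrepancy of 1 unit in the observed differences.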


Average  Effect 

In  order  to  deduce  the  properties  of  a  population  connected  with 
its  family  structure  we  have  to  deal  with  the  transmission  of  value 
from  parent  to  offspring,  and  this  cannot  be  done  by  means  of  geno- 
typic  values  alone,  because  parents  pass  on  their  genes  and  not  their 
genotypes  to  the  next  generation,  genotypes  being  created  afresh  in 
each generation. A new measure of value is therefore needed which
will refer to genes and not to genotypes. This will enable us to assign
a "breeding value" to individuals, a value associated with the genes
carried  by  the  individual  and  transmitted  to  its  offspring.  The  new 
measure  is  the  "average  effect."  We  can  assign  an  average  effect  to  a 
gene  in  the  population,  or  to  the  difference  between  one  gene  and 
another  of  an  allelic  pair.  The  average  effect  of  a  gene  is  the  mean 
deviation  from  the  population  mean  of  individuals  which  received 
that  gene  from  one  parent,  the  gene  received  from  the  other  parent 
having  come  at  random  from  the  population.  This  may  be  stated  in 
another way. Let a number of gametes all carrying A1 unite at random
with gametes from the population; then the mean deviation from
the population mean of the genotypes so produced is equal to the
average effect of the gene A1. The concept of average effect is perhaps
easier to grasp in the form of the average effect of a gene-substitution,
which can more conveniently be used when only two alleles at a locus
are under consideration. If we could change, say, A2 genes into A1 at
random  in  the  population,  and  could  then  note  the  resulting  change  of 
value,  this  would  be  the  average  effect  of  the  gene-substitution.  It  is 
equal  to  the  difference  between  the  average  effects  of  the  two  genes 
involved  in  the  substitution.  A  graphical  representation  of  the  average 
effect  of  a  gene-substitution  is  given  later  in  Fig.  7.2. 

It  is  important  to  realise  that  the  average  effect  of  a  gene  or  a  gene- 
substitution  depends  on  the  gene  frequency,  and  that  the  average 
effect  is  therefore  a  property  of  the  population  as  well  as  of  the  gene. 
The  reason  for  this  can  be  seen  in  the  words  "taken  at  random"  in  the 
definitions,  because  the  content  of  the  random  sample  depends  on  the 
gene  frequency  in  the  population.   The  point  may  perhaps  be  more 
easily  understood  from  a  specific  example.  Consider  the  substitution 
of  a  recessive  gene,  a,  for  its  dominant  allele,  A.  The  substitution  will 
change  the  value  only  when  the  individual  already  carries  one  reces- 
sive allele,  in  other  words  in  heterozygotes.    Changing  AA  into  Aa 
will  not  affect  the  value,  but  changing  Aa  into  aa  will.  Now,  when  the 
frequency  of  the  recessive  allele,  a,  is  low  there  will  be  many  AA 
individuals,  which  the  substitution  will  not  affect;  but  when  the 
recessive  is  at  high  frequency  there  will  be  very  few  AA  individuals, 
and  most  of  the  individuals  in  which  a  substitution  can  be  made  will 
be  affected  by  it.  Therefore  the  average  effect  of  the  substitution  will 
be  small  when  the  frequency  of  the  recessive  allele  is  low  and  large 
when  it  is  high. 
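
The argument may be put into numbers with a small Python sketch. It assumes complete dominance, with genotypic values of +a for AA and Aa and -a for aa, and random mating, so that a dominant gene taken at random has a recessive partner with probability q. The function name and the value a = 4 are hypothetical.

```python
def effect_of_substituting_recessive(half_range, q):
    """Average change of value when a dominant gene A, taken at random,
    is replaced by the recessive allele a.

    half_range is the homozygote deviation (written a in the text);
    q is the frequency of the recessive allele.
    """
    p = 1.0 - q
    change_in_AA = 0.0                 # AA -> Aa: value unchanged under complete dominance
    change_in_Aa = -2.0 * half_range   # Aa -> aa: value falls from +a to -a
    # The chosen A gene is in AA with probability p and in Aa with probability q.
    return p * change_in_AA + q * change_in_Aa


for q in (0.1, 0.5, 0.9):
    print(q, round(effect_of_substituting_recessive(4, q), 2))
```

The magnitude of the change is 2aq, which is small when the recessive allele is rare and approaches the full homozygote difference, 2a, when it is common.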

Let us see how the average effect is related to the genotypic
values, a and d, in terms of which the population mean was expressed.
This will help to make the concept clearer. The reasoning is set out
in Table 7.2. Consider a locus with two alleles, A1 and A2, at frequencies
p and q respectively, and take first the average effect of the

Table 7.2

           Values and frequencies
           of genotypes produced     Mean value      Population mean        Average effect
Type of    A1A1    A1A2    A2A2      of genotypes    to be deducted         of gene
gamete      a       d      -a        produced

A1          p       q                pa + qd         -[a(p - q) + 2dpq]     q[a + d(q - p)]
A2                  p       q        -qa + pd        -[a(p - q) + 2dpq]     -p[a + d(q - p)]

gene A1, for which we shall use the symbol α1. If gametes carrying A1
unite at random with gametes from the population, the frequencies
of the genotypes produced will be p of A1A1 and q of A1A2. The
genotypic value of A1A1 is +a and that of A1A2 is d, and the mean of
these, taking account of the proportions in which they occur, is
pa + qd. The difference between this mean value and the population
mean is the average effect of the gene A1. Taking the value of the
population mean from equation 7.2 we get

α1 = pa + qd - [a(p - q) + 2dpq]

   = q[a + d(q - p)]          (7.4a)

Similarly the average effect of the gene A2 is

α2 = -p[a + d(q - p)]          (7.4b)

Now consider the average effect of the gene-substitution, letting A1
be substituted for A2. Of the A2 genes taken at random from the
population for substitution, a proportion p will be found in A1A2
genotypes and a proportion q in A2A2 genotypes. In the former the
substitution will change the value from d to +a, and in the latter
from -a to d. The average change is therefore p(a - d) + q(d + a),
which on rearrangement becomes a + d(q - p). Thus the average
effect of the gene-substitution (written as α, without subscript) is


α = a + d(q - p)          (7.5)


The relation of α to α1 and α2 can be seen by comparing equations
7.5 and 7.4, whence

α = α1 - α2          (7.6)

and

α1 = qα
α2 = -pα          (7.7)

Example 7.4. Consider again the pygmy gene and its effect on body
weight, for which a = 4 gm. and d = 2 gm. If the frequency of the pg gene
were q = 0.1, the average effect of substituting + for pg would be, by
equation 7.5, α = 4 + 2 x (-0.8) = 2.4 gm. And if the frequency were
q = 0.4, the average effect of the gene-substitution would be: α = 4 + 2 x (-0.2)
= 3.6 gm. Thus, the average effect is greater when the gene frequency is
greater. The average effects of the genes separately are, by equation 7.7:

                                q = 0.1      q = 0.4
Average effect of + :   α1 =    +0.24        +1.44
Average effect of pg :  α2 =    -2.16        -2.16

(The identity of the average effects of pg at the two gene frequencies is
only a coincidence.)
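
The figures of this example may be reproduced, and equation 7.4a checked against the verbal definition of average effect, with the following Python sketch. The function names are hypothetical.

```python
def average_effects(a, d, q):
    """Return (alpha, alpha1, alpha2) from equations 7.5 and 7.7."""
    p = 1.0 - q
    alpha = a + d * (q - p)               # average effect of the gene-substitution
    return alpha, q * alpha, -p * alpha   # alpha1 = q*alpha, alpha2 = -p*alpha


def alpha1_from_definition(a, d, q):
    """Average effect of A1 obtained directly from its definition:
    gametes carrying A1 unite at random with gametes from the population,
    and the population mean is deducted from the mean of the genotypes produced."""
    p = 1.0 - q
    mean_of_genotypes_produced = p * a + q * d      # p of A1A1, q of A1A2
    population_mean = a * (p - q) + 2 * d * p * q   # equation 7.2
    return mean_of_genotypes_produced - population_mean


# Pygmy gene: a = 4 gm., d = 2 gm.
for q in (0.1, 0.4):
    alpha, alpha1, alpha2 = average_effects(4, 2, q)
    print(q, round(alpha, 2), round(alpha1, 2), round(alpha2, 2),
          round(alpha1_from_definition(4, 2, q), 2))
```

The last two columns of α1 agree, which is the numerical counterpart of the algebra leading to equation 7.4a.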


Breeding  Value 

The  usefulness  of  the  concept  of  average  effect  arises  from  the  fact, 
already  noted,  that  parents  pass  on  their  genes  and  not  their  genotypes 
to  their  progeny.  It  is  therefore  the  average  effects  of  the  parent's 
genes  that  determine  the  mean  genotypic  value  of  its  progeny.  The 
value  of  an  individual,  judged  by  the  mean  value  of  its  progeny,  is 
called  the  breeding  value  of  the  individual.  Breeding  value,  unlike 
average  effect,  can  therefore  be  measured.  If  an  individual  is  mated 
to  a  number  of  individuals  taken  at  random  from  the  population  then 
its  breeding  value  is  twice  the  mean  deviation  of  the  progeny  from 
the  population  mean.  The  deviation  has  to  be  doubled  because  the 
parent  in  question  provides  only  half  the  genes  in  the  progeny,  the 
other  half  coming  at  random  from  the  population.  Breeding  values 
can  be  expressed  in  absolute  units,  but  are  usually  more  conveniently 
expressed  in  the  form  of  deviations  from  the  population  mean,  as 
defined  above.  Just  as  the  average  effect  is  a  property  of  the  gene 
and  the  population  so  is  the  breeding  value  a  property  of  the  individual 
and  the  population  from  which  its  mates  are  drawn.  One  cannot 
speak  of  an  individual's  breeding  value  without  specifying  the  popu- 
lation in  which  it  is  to  be  mated. 

Defined in terms of average effects, the breeding value of an
individual  is  equal  to  the  sum  of  the  average  effects  of  the  genes  it 
carries,  the  summation  being  made  over  the  pair  of  alleles  at  each 
locus  and  over  all  loci.  Thus,  for  a  single  locus  with  two  alleles  the 
breeding  values  of  the  genotypes  are  as  follows: 

Genotype      Breeding value
A1A1          2α1 = 2qα
A1A2          α1 + α2 = (q - p)α
A2A2          2α2 = -2pα

Example 7.5. Let us illustrate breeding values by reference to the
pygmy gene in mice. The average effects of the + and pg genes were
given in the last example. From these we may find the breeding values of
the three genotypes as explained above. These breeding values, which are
given below, are deviations from the population mean. The population
means with gene frequencies of 0.1 and 0.4 were found in Example 7.2 and
are shown again below in the column headed M.

                            Breeding values
               M        + +       + pg      pg pg
q = 0.1      13.56     +0.48     -1.92     -4.32
q = 0.4      11.76     +2.88     -0.72     -4.32

(The breeding values of pygmy homozygotes are only hypothetical
because in fact pygmy homozygotes are nearly all sterile: but this
complication may be overlooked in the present context.)
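
The breeding values in Example 7.5 can be reproduced with a short
calculation. The following is a minimal sketch, not part of the original
example: the function name breeding_values is our own, and the assigned
values a = 4 and d = 2 (with a mid-homozygote point of 10) are inferred
from the pygmy genotypic values of 14, 12 and 6 implied by the tables.

```python
def breeding_values(a, d, q):
    """Return (alpha, mean, breeding values) for a locus with two alleles.

    a, d : assigned genotypic values, measured from the mid-homozygote point
    q    : frequency of the allele that lowers the value (here pg)
    The breeding values are deviations from the population mean, in the
    order (A1A1, A1A2, A2A2).
    """
    p = 1.0 - q
    alpha = a + d * (q - p)                # average effect of the gene-substitution
    mean = a * (p - q) + 2.0 * d * p * q   # population mean, from the midpoint
    values = (2.0 * q * alpha, (q - p) * alpha, -2.0 * p * alpha)
    return alpha, mean, values


# Assumed inputs for the pygmy gene: a = 4, d = 2, midpoint = 10.
MIDPOINT = 10.0
for q in (0.1, 0.4):
    alpha, mean, values = breeding_values(a=4.0, d=2.0, q=q)
    print(q, round(MIDPOINT + mean, 2), [round(v, 2) for v in values])
```

Because the average effect depends on q, the same genotype has a
different breeding value in the two populations, as the table shows.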


Extension  to  a  locus  with  more  than  two  alleles  is  straightforward, 
the  breeding  value  of  any  genotype  being  the  sum  of  the  average 
effects  of  the  two  alleles  present.  If  all  loci  are  to  be  taken  into 
account,  the  breeding  value  of  a  particular  genotype  is  the  sum  of  the 
breeding  values  attributable  to  each  of  the  separate  loci.  If  there  is 
non-additive  combination  of  genotypic  values  a  slight  complication 
arises.  We  have  given  two  definitions  of  breeding  value,  a  practical 
one  in  terms  of  the  measured  value  of  the  progeny  and  a  theoretical 
one  in  terms  of  average  effects.  Non-additive  combination  renders 
these  two  definitions  not  quite  equivalent.  This  point  will  be  more 
fully  explained  in  Chapter  9. 

Consideration of the definition of breeding value will show that
in a population in Hardy-Weinberg equilibrium the mean breeding
value must be zero; or if breeding values are expressed in absolute
units the mean breeding value must be equal to the mean genotypic
value and to the mean phenotypic value. This can be verified from
the breeding values listed above. Multiplying the breeding value by
the frequency of each genotype and summing gives the mean breeding
value (expressed as a deviation from the population mean) as

    $2p^2q\alpha + 2pq(q - p)\alpha - 2q^2p\alpha = 2pq\alpha(p + q - p - q) = 0$

The  breeding  value  is  sometimes  referred  to  as  the  "additive 
genotype,"  and  variation  in  breeding  value  ascribed  to  the  "additive 
effects"  of  genes.  Though  we  shall  not  use  these  terms  we  shall 
follow  custom  in  using  the  term  "additive"  in  connexion  with  the 
variation  of  breeding  values  to  be  discussed  in  the  next  chapter,  and 
we  shall  use  the  symbol  A  to  designate  the  breeding  value  of  an 
individual. 


Dominance  Deviation 

We  have  separated  off  the  breeding  value  as  a  component  part  of 
the  genotypic  value  of  an  individual.  Let  us  consider  now  what 
makes up the remainder. When a single locus only is under
consideration, the difference between the genotypic value, G, and the
breeding value, A, of a particular genotype is known as the dominance
deviation D, so that

    $G = A + D$    (7.8)

The dominance deviation arises from the property of dominance
among the alleles at a locus, since in the absence of dominance
breeding values and genotypic values coincide. From the statistical point
of  view  the  dominance  deviations  are  interactions  between  alleles,  or 
within-locus  interactions.  They  represent  the  effect  of  putting  genes 
together  in  pairs  to  make  genotypes;  the  effect  not  accounted  for  by 
the  effects  of  the  two  genes  taken  singly.  Since  the  average  effects  of 
genes  and  the  breeding  values  of  genotypes  depend  on  the  gene 
frequency  in  the  population,  the  dominance  deviations  are  also 
dependent  on  gene  frequency.  They  are  therefore  partly  properties 
of  the  population  and  are  not  simply  measures  of  the  degree  of 
dominance. 

Example 7.6. Continuing with the example of the pygmy gene, we
may now list the genotypic values and the breeding values, and so obtain
the dominance deviations of the three genotypes, by equation 7.8. These
values, all now expressed as deviations from the population mean, M, are
as follows:

q = 0.1: M = 13.56
                          + +       + pg      pg pg
Frequency                 0.81      0.18      0.01
Genotypic value, G       +0.44     -1.56     -7.56
Breeding value, A        +0.48     -1.92     -4.32
Dominance dev., D        -0.04     +0.36     -3.24

q = 0.4: M = 11.76
                          + +       + pg      pg pg
Frequency                 0.36      0.48      0.16
Genotypic value, G       +2.24     +0.24     -5.76
Breeding value, A        +2.88     -0.72     -4.32
Dominance dev., D        -0.64     +0.96     -1.44

The relations between genotypic values, breeding values and
dominance deviations can be illustrated graphically, as in Fig. 7.2,
and the meaning of the dominance deviation is perhaps more easily
understood in this way. In the figure the genotypic value (black dots)
is plotted against the number of $A_1$ genes in the genotype. A straight
regression line is fitted by least squares to these points, each point
being weighted by the frequency of the genotype it represents. The
position of this line gives the breeding values of each genotype, as
shown by the open circles. The differences between the breeding
values and the genotypic values are the dominance deviations,
indicated by vertical dotted lines. The cross marks the population mean.
The average effect, $\alpha$, of the gene-substitution is given by the
difference in breeding value between $A_2A_2$ and $A_1A_2$, or between
$A_1A_2$ and $A_1A_1$, as indicated. The original definition of the
average effect of a gene-substitution was given by Fisher (1918, 1941)
in terms of this linear regression of genotypic value on number of genes.

Fig. 7.2. Graphical representation of genotypic values (closed
circles), and breeding values (open circles), of the genotypes for a
locus with two alleles, $A_1$ and $A_2$, at frequencies p and q, as
explained in the text. Horizontal scale: number of $A_1$ genes in the
genotype. Vertical scales of value: on left, arbitrary values
assigned as in Fig. 7.1; on right, deviations from the population
mean. The figure is drawn to scale for the values: $d = \frac{3}{4}a$,
and $q = \frac{1}{4}$.
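
The regression interpretation can be checked numerically. The sketch
below is illustrative only: the helper names are our own, the genotypic
values 14, 12 and 6 are those implied for the pygmy gene, and "A1" is
taken to be the + allele. It fits the frequency-weighted least-squares
line of genotypic value on gene number, reads off the breeding values
from the line, and takes the dominance deviations as the residuals.

```python
def weighted_line(xs, ys, ws):
    """Fit y = intercept + slope * x by least squares with weights ws."""
    total = sum(ws)
    x_bar = sum(w * x for x, w in zip(xs, ws)) / total
    y_bar = sum(w * y for y, w in zip(ys, ws)) / total
    sxy = sum(w * (x - x_bar) * (y - y_bar) for x, y, w in zip(xs, ys, ws))
    sxx = sum(w * (x - x_bar) ** 2 for x, w in zip(xs, ws))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def regression_components(values, q):
    """Breeding values and dominance deviations from the regression line.

    values : genotypic values in the order (A1A1, A1A2, A2A2)
    q      : frequency of A2
    """
    p = 1.0 - q
    gene_counts = [2, 1, 0]                  # number of A1 genes
    freqs = [p * p, 2.0 * p * q, q * q]      # Hardy-Weinberg frequencies
    intercept, slope = weighted_line(gene_counts, values, freqs)
    mean = sum(f * v for f, v in zip(freqs, values))
    fitted = [intercept + slope * n for n in gene_counts]
    breeding = [y - mean for y in fitted]                 # points on the line
    dominance = [v - y for v, y in zip(values, fitted)]   # residuals
    return slope, breeding, dominance


slope, A, D = regression_components([14.0, 12.0, 6.0], q=0.1)
print(round(slope, 2), [round(x, 2) for x in A], [round(x, 2) for x in D])
```

The slope of the fitted line is the average effect of the
gene-substitution, and the breeding values and residuals agree with the
q = 0.1 rows of Example 7.6.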

The dominance deviation can be expressed in terms of the
arbitrarily assigned genotypic values a and d, by subtraction of the
breeding value from the genotypic value, as shown in Table 7.3.

Table 7.3

Values of genotypes in a two-allele system, measured as
deviations from the population mean.
Population mean: $M = a(p - q) + 2dpq$
Average effect of gene-substitution: $\alpha = a + d(q - p)$

Genotypes               $A_1A_1$              $A_1A_2$                   $A_2A_2$
Frequencies             $p^2$                 $2pq$                      $q^2$
Assigned values         $a$                   $d$                        $-a$
Deviations from population mean:
  Genotypic value       $2q(a - pd)$          $a(q - p) + d(1 - 2pq)$    $-2p(a + qd)$
                        $= 2q(\alpha - qd)$   $= (q - p)\alpha + 2pqd$   $= -2p(\alpha + pd)$
  Breeding value        $2q\alpha$            $(q - p)\alpha$            $-2p\alpha$
  Dominance deviation   $-2q^2d$              $2pqd$                     $-2p^2d$

The genotypic values must first be converted to deviations from the
population mean, because the breeding values have been expressed
in this way. The genotypic values, so converted, are given in two
forms: in terms of a and in terms of $\alpha$. Let us take the genotype
$A_1A_1$ to show how these are obtained and how the dominance deviation is
obtained by subtraction of the breeding value. The arbitrarily
assigned genotypic value of $A_1A_1$ is +a, and the population mean is
$a(p - q) + 2dpq$. Expressed as a deviation from the population mean,
the genotypic value is therefore

    $a - [a(p - q) + 2dpq] = a(1 - p + q) - 2dpq = 2qa - 2dpq = 2q(a - dp)$.

This may be expressed in terms of the average effect, $\alpha$, by
substituting $a = \alpha - d(q - p)$ (from equation 7.5), and the
genotypic value then becomes $2q(\alpha - qd)$. Subtraction of the
breeding value, $2q\alpha$, gives the dominance deviation as $-2q^2d$.
By similar reasoning the dominance deviation of $A_1A_2$ is $2pqd$, and
that of $A_2A_2$ is $-2p^2d$. Thus all the dominance deviations are
functions of d. If there is no dominance d is zero and the dominance
deviations are also all zero. Therefore in the absence of dominance,
breeding values and genotypic values are the same. Genes that show no
dominance (d = 0) are sometimes called "additive genes," or are said to
"act additively."

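The "similar reasoning" for the other two genotypes can be written out
in the same way. The steps below are our own working, using only the
definitions of M and $\alpha$ given in Table 7.3 and the identities
$p + q = 1$ and $(q - p)^2 = 1 - 4pq$; they are offered as a check
rather than as the author's derivation.

```latex
\begin{align*}
A_1A_2:\quad d - M &= d - a(p - q) - 2dpq \\
  &= a(q - p) + d(1 - 2pq) \\
  &= (q - p)\alpha - d(q - p)^2 + d(1 - 2pq)
     && \text{since } a = \alpha - d(q - p) \\
  &= (q - p)\alpha + 2pqd
     && \text{since } (q - p)^2 = 1 - 4pq, \\
D &= \bigl[(q - p)\alpha + 2pqd\bigr] - (q - p)\alpha = 2pqd. \\[1ex]
A_2A_2:\quad -a - M &= -a(1 + p - q) - 2dpq \\
  &= -2p(a + qd)
     && \text{since } 1 + p - q = 2p \\
  &= -2p(\alpha + pd)
     && \text{since } a + qd = \alpha + pd, \\
D &= -2p(\alpha + pd) - (-2p\alpha) = -2p^2d.
\end{align*}
```
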
Since  the  mean  breeding  value  and  the  mean  genotypic  value  are 
equal,  it  follows  that  the  mean  dominance  deviation  is  zero.  This  can 
be  verified  by  multiplying  the  dominance  deviation  by  the  frequency 
of  each  genotype  and  summing.  The  mean  dominance  deviation  is 
thus

    $-2p^2q^2d + 4p^2q^2d - 2p^2q^2d = 0$

Another  fact,  which  will  be  needed  later  when  we  deal  with 
variances,  may  be  noted  here:  there  is  no  correlation  between  the 
dominance  deviation  and  the  breeding  value  of  the  different  genotypes. 
This  can  be  shown  by  multiplying  together  the  dominance  deviation, 
the  breeding  value  and  the  frequency  of  each  genotype.  Summation 
gives the sum of cross-products, and it works out to be zero, thus:

    $-4p^2q^3\alpha d + 4p^2q^2(q - p)\alpha d + 4p^3q^2\alpha d = 4p^2q^2\alpha d(-q + q - p + p) = 0$

Since the sum of cross-products is zero, breeding values and
dominance deviations are uncorrelated.
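
These three identities (zero mean breeding value, zero mean dominance
deviation, and zero sum of cross-products) can be confirmed exactly for
particular values. The sketch below is illustrative only: the function
names are our own, and exact rational arithmetic from Python's standard
fractions module is used so that the sums come out as exactly zero
rather than approximately so.

```python
from fractions import Fraction


def locus_components(a, d, q):
    """Frequencies, breeding values and dominance deviations for one locus.

    Values are in the order (A1A1, A1A2, A2A2), as deviations from the
    population mean, using the expressions of Table 7.3.
    """
    p = 1 - q
    alpha = a + d * (q - p)
    freqs = (p * p, 2 * p * q, q * q)
    breeding = (2 * q * alpha, (q - p) * alpha, -2 * p * alpha)
    dominance = (-2 * q * q * d, 2 * p * q * d, -2 * p * p * d)
    return freqs, breeding, dominance


def weighted_sum(freqs, *columns):
    """Sum over genotypes of frequency times the product of the columns."""
    total = 0
    for i, f in enumerate(freqs):
        term = f
        for column in columns:
            term *= column[i]
        total += term
    return total


# Arbitrary illustrative inputs: a = 4, d = 2, and three gene frequencies.
for q in (Fraction(1, 10), Fraction(2, 5), Fraction(3, 4)):
    freqs, A, D = locus_components(Fraction(4), Fraction(2), q)
    assert weighted_sum(freqs, A) == 0      # mean breeding value
    assert weighted_sum(freqs, D) == 0      # mean dominance deviation
    assert weighted_sum(freqs, A, D) == 0   # sum of cross-products
```

A check of this kind does not replace the algebra, but it guards against
slips of sign when the expressions are copied.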


Interaction  Deviation 

When only a single locus is under consideration the genotypic
value is made up of the breeding value and the dominance deviation
only. But when the genotype refers to more than one locus the
genotypic value may contain an additional deviation due to non-additive
combination. Let $G_A$ be the genotypic value of an individual
attributable to one locus, $G_B$ that attributable to a second locus,
and G the aggregate genotypic value attributable to both loci together.
Then

    $G = G_A + G_B + I_{AB}$    (7.9)

where $I_{AB}$ is the deviation from additive combination of these
genotypic values. In dealing with the population mean, earlier in this
chapter, we assumed that I was zero for all combinations of
genotypes. If I differs from zero for any combination of genes at
different loci, those genes are said to "interact" or to exhibit
"epistasis," the term epistasis being given a wider meaning in
quantitative genetics than in Mendelian genetics. The deviation I is
called the interaction deviation or epistatic deviation. Loci may
interact in pairs or in threes or higher numbers, and the interactions
may be of many different sorts, as the behaviour of major genes shows.
The complex nature of the interactions, however, need not concern us,
because in the aggregate genotypic value interactions of all sorts are
treated together as a single interaction deviation. So for all loci
together we can write

    $G = A + D + I$    (7.10)

where  A  is  the  sum  of  the  breeding  values  attributable  to  the  separate 
loci,  and  D  is  the  sum  of  the  dominance  deviations. 

If  the  interaction  deviation  is  zero  the  genes  concerned  are  said  to 
"act  additively"  between  loci.  Thus  "additive  action"  may  mean  two 
different  things.  Referred  to  genes  at  one  locus  it  means  the  absence 
of  dominance,  and  referred  to  genes  at  different  loci  it  means  the 
absence  of  epistasis. 

Example 7.7. As an example of non-additive combination of two loci
we shall take the same two colour genes in mice that were used in Example
7.3 to illustrate additive combination; but this time we refer to their effects
on the size of the pigment granules, instead of their number (Russell,
1949). The mean size (diameter in $\mu$) of the granules in the four
genotypes was as follows:

                B-       bb      Diff.
C-             1.44     0.77     0.67
$c^ec^e$       0.94     0.77     0.17
Diff.          0.50     0.00

This time the differences are not independent of the other genotype: the
$c^e$ gene for example has quite a large effect on the B- genotype, but none
at all on the bb genotype. Thus the two loci show epistatic interaction and
do not combine additively. Let us therefore work out the interaction
deviations. This is not altogether a straightforward matter because the
deviations depend on the gene frequencies in the population under
discussion; it does, however, help to clarify the meaning of the interaction
deviations.

If we were to measure the homozygote differences of these two loci
with the object of estimating the value of a for each, the results would
depend on the gene frequency at the other locus. For example, the
difference between B- and bb would be 0.67 if measured in C- genotypes, but
0.17 if measured in $c^ec^e$ genotypes. The value of a therefore depends on
the population in which it is measured. Let us take, for the sake of
illustration, a population in which the frequency of bb genotypes is
$q^2 = 0.4$ and the frequency of $c^ec^e$ genotypes is $q^2 = 0.2$. Then
the mean homozygote difference for the B locus will be
$2a_B = (0.67 \times 0.8) + (0.17 \times 0.2) = 0.57$. Similarly, for the
C locus, $2a_C = (0.50 \times 0.6) + (0.00 \times 0.4) = 0.30$. The object
now is to find for each genotype the aggregate genotypic value, G, for the
two loci combined (i.e. the observed values given above); then the
genotypic values, $G_B$ and $G_C$, derived from consideration of the two
loci separately; and, finally, the interaction deviation, $I_{BC}$,
according to equation 7.9. The procedure is simplified if all these values
are expressed as deviations from the population mean. The table gives, in
line (1), the four genotypes (assuming again complete dominance at both
loci); in line (2), the frequency of each genotype in the population; and
in line (3), the observed value of granule size in each genotype. The
population mean is found by multiplying the value by the frequency of each
genotype and summing over the four genotypes. This yields M = 1.112.
Subtracting the population mean from the observed value gives the aggregate
genotypic value, G, as a deviation from the population mean, shown in
line (4).

(1) Genotypes          B- C-     B- $c^ec^e$    bb C-     bb $c^ec^e$    Mean
(2) Frequencies        0.48      0.12           0.32      0.08
(3) Observed values    1.44      0.94           0.77      0.77           1.112
(4) G                 +0.328    -0.172         -0.342    -0.342          0
(5) $G_B + G_C$       +0.288    -0.012         -0.282    -0.582          0
(6) $I_{BC}$          +0.040    -0.160         -0.060    +0.240          0

Now consider each locus separately, paying no regard to the other locus.
The genotypic values for a single locus, expressed as deviations from the
population mean, were given in Table 7.3. With complete dominance these
reduce to $2aq^2$ for the two dominant genotypes combined, and
$-2a(1 - q^2)$ for the recessive homozygote. Take the B- genotype for
example: the value of $2a_B$ in the population under consideration was
shown above to be 0.57, and the value of $q^2$ assumed is 0.4; therefore
the genotypic value is $0.57 \times 0.4 = +0.228$. This is the average
value of the B- genotype irrespective of the other locus. The other
single-locus values, found in a similar way, are as follows:

             B-        bb                    C-        $c^ec^e$
$G_B$:     +0.228    -0.342      $G_C$:    +0.060    -0.240

The values given in line (5) of the table as $G_B + G_C$ are found by
summation of the two appropriate single-locus values. For example, the
B- C- genotype is +0.228 + 0.060 = +0.288. These are the genotypic values
expected if there were additive combination. It may be noted that their
mean, obtained by summation of (value x frequency), is zero, as is the mean
of the aggregate genotypic values in line (4). Finally, the interaction
deviations, $I_{BC}$, given in line (6) are obtained by subtracting the
"expected" values in line (5) from the "actual" values in line (4). The
mean interaction deviation is also zero.
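
The arithmetic of Example 7.7 can be retraced in a few lines. This is a
sketch under stated assumptions, not the author's procedure: the function
and variable names are our own, and each single-locus value is computed
as the average of the aggregate values over the genotypes at the other
locus (the "average value ... irrespective of the other locus"), which
gives the same figures as the $2aq^2$ and $-2a(1 - q^2)$ expressions used
in the text.

```python
# Observed granule sizes and genotype frequencies taken from Example 7.7.
observed = {
    ("B-", "C-"): 1.44, ("B-", "cece"): 0.94,
    ("bb", "C-"): 0.77, ("bb", "cece"): 0.77,
}
freq_B = {"B-": 0.6, "bb": 0.4}
freq_C = {"C-": 0.8, "cece": 0.2}


def interaction_deviations(observed, freq_B, freq_C):
    """Return (mean, G, G_B, G_C, I), all but the mean as deviations."""
    # Frequencies of the two-locus genotypes, assuming independent loci.
    freq = {(b, c): freq_B[b] * freq_C[c] for (b, c) in observed}
    mean = sum(freq[k] * observed[k] for k in observed)
    G = {k: observed[k] - mean for k in observed}
    # Single-locus values: average over the genotypes at the other locus.
    G_B = {b: sum(freq_C[c] * G[(b, c)] for c in freq_C) for b in freq_B}
    G_C = {c: sum(freq_B[b] * G[(b, c)] for b in freq_B) for c in freq_C}
    # Interaction deviation: what additive combination fails to explain.
    I = {(b, c): G[(b, c)] - (G_B[b] + G_C[c]) for (b, c) in observed}
    return mean, G, G_B, G_C, I


mean, G, G_B, G_C, I = interaction_deviations(observed, freq_B, freq_C)
for genotype in observed:
    print(genotype, round(G[genotype], 3), round(I[genotype], 3))
```

Changing the frequencies in freq_B or freq_C changes every interaction
deviation, which illustrates the dependence on gene frequency noted in
the example.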


CHAPTER   8 


VARIANCE 


The  genetics  of  a  metric  character  centres  round  the  study  of  its 
variation,  for  it  is  in  terms  of  variation  that  the  primary  genetic 
questions  are  formulated.  The  basic  idea  in  the  study  of  variation  is 
its  partitioning  into  components  attributable  to  different  causes.  The 
relative  magnitude  of  these  components  determines  the  genetic 
properties  of  the  population,  in  particular  the  degree  of  resemblance 
between  relatives.  In  this  chapter  we  shall  consider  the  nature  of 
these  components  and  how  the  genetic  components  depend  on  the 
gene  frequency.  Then,  in  the  next  chapter,  we  shall  show  how  the 
degree of resemblance between relatives is determined by the
magnitudes of the components.

The amount of variation is measured and expressed as the
variance: when values are expressed as deviations from the population
mean  the  variance  is  simply  the  mean  of  the  squared  values.  The 
components  into  which  the  variance  is  partitioned  are  the  same  as  the 
components  of  value  described  in  the  last  chapter;  so  that,  for 
example,  the  genotypic  variance  is  the  variance  of  genotypic  values, 
and  the  environmental  variance  is  the  variance  of  environmental 
deviations.  The  total  variance  is  the  phenotypic  variance,  or  the 
variance of phenotypic values, and is the sum of the separate
components. The components of variance and the values whose variance
they  measure  are  listed  in  Table  8.1. 


Table 8.1

Components of Variance

Variance component    Symbol    Value whose variance is measured
Phenotypic            $V_P$     Phenotypic value
Genotypic             $V_G$     Genotypic value
Additive              $V_A$     Breeding value
Dominance             $V_D$     Dominance deviation
Interaction           $V_I$     Interaction deviation
Environmental         $V_E$     Environmental deviation

The total variance is then, with certain qualifications to be
mentioned below, the sum of the components, thus:

    $V_P = V_G + V_E$
        $= V_A + V_D + V_I + V_E$    (8.1)
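
As a small illustration of variance as the mean of squared deviations,
the values listed for the pygmy gene at q = 0.1 in Example 7.6 can be
used. This is only a sketch: the function name is our own, a single
locus is considered (so there is no interaction deviation), and
environmental variation is left out entirely.

```python
def variance(values, freqs):
    """Variance as the frequency-weighted mean of squared deviations."""
    mean = sum(f * v for f, v in zip(freqs, values))
    return sum(f * (v - mean) ** 2 for f, v in zip(freqs, values))


# Values from Example 7.6 (q = 0.1), as deviations from the population mean.
freqs = [0.81, 0.18, 0.01]
G = [0.44, -1.56, -7.56]   # genotypic values
A = [0.48, -1.92, -4.32]   # breeding values
D = [-0.04, 0.36, -3.24]   # dominance deviations

V_G = variance(G, freqs)
V_A = variance(A, freqs)
V_D = variance(D, freqs)
print(round(V_G, 4), round(V_A, 4), round(V_D, 4))
```

For these values the genotypic variance equals the sum of the additive
and dominance variances, as expected when breeding values and dominance
deviations are uncorrelated.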

Let  us  now  consider  these  components  of  variance  in  detail. 


Genotypic  and  Environmental  Variance 

The first division of phenotypic value that we made in the last
chapter was into genotypic value and environmental deviation,
$P = G + E$. The corresponding partition of the variance into genotypic
and environmental components formulates the problem of "heredity
versus environment" or "nature and nurture"; or, to put the question
more precisely, the relative importance of genotype and environment
in determining the phenotypic value. The "relative importance" of a
cause of variation means the amount of variation that it gives rise to,
as a proportion of the total. So the relative importance of genotype
as a determinant of phenotypic value is given by the ratio of
genotypic to phenotypic variance, $V_G/V_P$. The genotypic and
environmental components cannot be estimated directly from observations
on the population, but in certain circumstances they can be estimated
in experimental populations. If one or other component could be
completely eliminated, the remaining phenotypic variance would
provide an estimate of the remaining component. Environmental
variance cannot be removed because it includes by definition all
non-genetic variance, and much of this is beyond experimental
control. Elimination of genotypic variance can, however, be achieved
experimentally. Highly inbred lines, or the $F_1$ of a cross between two
such lines, provide individuals all of identical genotype and therefore
with no genotypic variance. If a group of such individuals is raised
under the normal range of environmental circumstances, their
phenotypic variance provides an estimate of the environmental variance
$V_E$. Subtraction of this from the phenotypic variance of a genetically
mixed population then gives an estimate of the genotypic variance
of this population.

Example 8.1. Partitioning of the phenotypic variance into its
genotypic and environmental components has been done for several characters
in Drosophila melanogaster. The results are given later, in Table 8.2, but
here we may describe the results for one character in more detail in order
to show how the partitioning is made. The character is the length of the
thorax (in units of 1/100 mm.), which may be regarded as a measure of body
size. The phenotypic variance was measured first in a genetically mixed
(i.e. a random-bred) population, and then in a genetically uniform
population, consisting of the $F_1$ generation of three crosses between highly
inbred lines. The first estimates the genotypic and environmental variance
together, and the second estimates the environmental variance alone. So,
by subtraction, an estimate of the genotypic variance is obtained. The
results, obtained by F. W. Robertson (1957b), were as follows:

Population     Components       Observed variance
Mixed          $V_G + V_E$      0.366
Uniform        $V_E$            0.186
Difference     $V_G$            0.180

               $V_G/V_P$ = 49%

Thus 49 per cent of the variation of thorax length in this genetically mixed
population is attributable to genetic differences between individuals, and
51 per cent to non-genetic differences.
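
The subtraction in Example 8.1 is simple enough to do by hand, but
writing it out makes the assumptions explicit. In the sketch below the
function name is our own, and the calculation assumes, as the example
does, that the environmental variance of the uniform group is the same
as that of the mixed population.

```python
def partition_variance(v_mixed, v_uniform):
    """Split phenotypic variance into genotypic and environmental parts.

    v_mixed   : phenotypic variance of the genetically mixed population
    v_uniform : phenotypic variance of the genetically uniform group,
                taken as the estimate of the environmental variance
    Returns (V_G, V_E, V_G / V_P).
    """
    v_e = v_uniform
    v_g = v_mixed - v_uniform
    return v_g, v_e, v_g / v_mixed


# Thorax length in Drosophila melanogaster, figures from Example 8.1.
v_g, v_e, ratio = partition_variance(v_mixed=0.366, v_uniform=0.186)
print(round(v_g, 3), round(v_e, 3), round(100 * ratio))
```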

Individuals  of  identical  genotype  are  also  provided  by  identical 
twins  in  man  and  cattle,  but  their  use  in  partitioning  the  variance  is 
very  limited:  they  will  be  discussed  in  a  later  chapter  when  the 
problems that they raise will be better understood. Apart from the
severely limited use of identical twins, the partitioning of the
variance into genotypic and environmental components depends on the
availability of highly inbred lines, and is therefore restricted to
experimental populations of plants or small animals.

Three  complications  arise  in  connexion  with  the  partitioning  of 
the  variance  into  genotypic  and  environmental  components.  They 
are  all  things  that  can  usually  be  neglected  or  circumvented  with  little 
risk  of  error,  but  in  some  circumstances  they  may  be  important.  The 
following  account  of  them  might  well  be  omitted  at  a  first  reading, 
unless  the  reader  is  worried  by  the  logical  fallacies  introduced  by 
neglecting  them. 

Dependence of environmental variance on genotype.
Experiments of the type illustrated in Example 8.1 rest on the
assumption that the environmental variance is the same in all genotypes,
and this is certainly not always true. The environmental variance
measured in one inbred line or cross is that shown by this one particular
genotype, and other genotypes may be more or less sensitive to
environmental influences and may therefore show more or less
environmental variance. The environmental variance of the mixed
population may therefore not be the same as that measured in the
genotypically uniform group. Not very much is yet known about this
complication except that many characters show more environmental
variance among inbred than among outbred individuals, inbreds being
more sensitive or less well "buffered." The reality of the
complication is therefore not in doubt. Further discussion of the
phenomenon will be found under the effects of inbreeding, in Chapter 15,
where it more properly belongs. The existence of this complication means
that when dealing with genotypically mixed populations we have to
define the environmental component of variance as the mean
environmental variance of the genotypes in the population, and we have
to recognise the possibility that if the frequencies of the genotypes
are changed, as by selection, the environmental variance may also be
changed in consequence.

Genotype-environment correlation. Hitherto we have tacitly
assumed that environmental deviations and genotypic values are
independent of each other; in other words that there is no correlation
between genotypic value and environmental deviation, such as would
arise if the better genotypes were given better environments.
Correlation between genotype and environment is seldom an important
complication, and can usually be neglected in experimental
populations, where randomisation of environment is one of the chief
objects of experimental design. There are some situations, however, in
which the correlation exists. Milk-yield in dairy cattle provides an
example. The normal practice of dairy husbandry is to feed cows
according to their yield, the better phenotypes being given more
food. This introduces a correlation between phenotypic value and
environmental deviation; and, since genotypic and phenotypic values
are correlated, there is also a correlation between genotypic value and
environmental deviation. The complication of genotype-environment
correlation is very simply overcome by regarding the special
environment (i.e. the feeding level in the case of cows) as part of the
genotype. This situation is covered by the definition of genotypic value,
provided genotypic values are taken to refer to genotypes as they
occur under the normal conditions of association with specific
environments. If genotypic values were not so defined we could not
treat the phenotypic variance as simply the sum of the genotypic and
environmental variances, but we should have to include a covariance
term, thus:

    $V_P = V_G + V_E + 2\,\mathrm{cov}_{GE}$    (8.2)

where $\mathrm{cov}_{GE}$ is the covariance of genotypic values and
environmental deviations. If the genotypic variance is estimated, as in
Example 8.1, by the comparison of genetically identical with genetically
mixed groups, then the covariance would be eliminated with the genotypic
variance from the genetically identical group, and the estimate
obtained will be of genotypic variance together with twice the
covariance. Thus, while on theoretical grounds it is convenient, on
practical grounds it is unavoidable, to regard any covariance that may
arise from genotype-environment correlation as being part of the
genotypic variance.
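
Equation 8.2 can be illustrated with a few invented numbers. Everything
in the sketch below is hypothetical: the five genotypic values and
environmental deviations are made up so that better genotypes tend to
receive better environments, and the function names are our own.
Variances are computed as mean squared deviations over the five
individuals.

```python
def mean(xs):
    return sum(xs) / len(xs)


def covariance(xs, ys):
    """Mean cross-product of deviations (the variance when xs is ys)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical genotypic values and environmental deviations.
G = [-2.0, -1.0, 0.0, 1.0, 2.0]
E = [-1.5, 0.5, -0.5, 0.5, 1.0]
P = [g + e for g, e in zip(G, E)]   # phenotypic values, P = G + E

V_G = covariance(G, G)
V_E = covariance(E, E)
cov_GE = covariance(G, E)
V_P = covariance(P, P)

# Equation 8.2: the phenotypic variance includes twice the covariance.
assert abs(V_P - (V_G + V_E + 2 * cov_GE)) < 1e-12

# Estimating "genotypic variance" as V_P - V_E picks up the covariance too.
estimate = V_P - V_E
print(V_G, cov_GE, estimate)
```

With these invented figures the estimate obtained by subtraction exceeds
the genotypic variance by twice the covariance, which is the point made
in the text.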

Genotype-environment  interaction.  Another  assumption  that 
we have made, which is not always justifiable, is that a specific
difference of environment has the same effect on different genotypes; or, in
other  words,  that  we  can  associate  a  certain  environmental  deviation 
with  a  specific  difference  of  environment,  irrespective  of  the  genotype 
on  which  it  acts.  When  this  is  not  so  there  is  an  interaction,  in  the 
statistical  sense,  between  genotypes  and  environments.  There  are 
several  forms  which  this  interaction  may  take  (Haldane,  1946).  For 
example,  a  specific  difference  of  environment  may  have  a  greater 
effect  on  some  genotypes  than  on  others;  or  there  may  be  a  change  in 
the  order  of  merit  of  a  series  of  genotypes  when  measured  under 
different  environments.  That  is  to  say,  genotype  A  may  be  superior 
to  genotype  B  in  environment  X,  but  inferior  in  environment  Y,  as  in 
the  following  example. 

Example 8.2. The following figures show the growth, between 3 and
6 weeks of age, of two strains of mice reared on two levels of nutrition
(original data):

              Good nutrition     Bad nutrition
Strain A      17.2 gm.           12.6 gm.
Strain B      16.6 gm.           13.3 gm.

Strain A grows better than strain B under good conditions, but worse
under bad conditions.

An interaction between genotype and environment, whatever its
nature, gives rise to an additional component of variance. This
interaction variance can be isolated and measured only under rather
artificial circumstances. We may replicate genotypes by the use of
inbred lines or $F_1$'s, and replicate specific environments by the
control of such factors as nutrition or temperature. Then an analysis of
variance in a two-way classification of genotypes x environments will
yield estimates of the genotypic variance (between genotypes), the
environmental variance (between environments) and the variance
attributable to interaction of genotypes with environments. The
specific environments in such an experiment are, however, more in
the nature of "treatments" because a population under genetical
study would not normally encounter so wide a range of environments
as that provided by the different treatments. It is therefore the
genotype-environment interaction occurring within one such
treatment that is relevant to the genetical study of a population, and
this cannot be measured because the separate elements of the
environment cannot be isolated and controlled. In an experiment such as
that of Example 8.1, which removes the genotypic variance by the
use of inbred lines or $F_1$'s, the interaction variance remains with the
environmental in the phenotypic variance measured in the genetically
uniform individuals. In normal circumstances, therefore, the
variance due to genotype-environment interaction, since it cannot be
separately measured, is best regarded as part of the environmental
variance. When large differences of environment, such as between
different habitats, are under consideration, the presence of
genotype-environment interaction becomes important in connexion with the
specialisation of breeds or varieties to local conditions. This matter
will be taken up again later, in Chapter 19, because it can be more
profitably discussed from a different viewpoint.


Genetic  Components  of  Variance 

The  partition  into  genotypic  and  environmental  variance  does  not 
take  us  far  toward  an  understanding  of  the  genetic  properties  of  a 
population, and in particular it does not reveal the cause of
resemblance between relatives. The genotypic variance must be further
divided according to the division of genotypic value into breeding
value, dominance deviation, and interaction deviation. Thus we have:

    Values:                G    =   A    +   D    +   I
    Variance components:   V_G  =   V_A  +   V_D  +   V_I        (8.4)
                    (genotypic) (additive) (dominance) (interaction)

The additive variance, which is the variance of breeding values, is the
important component since it is the chief cause of resemblance
between relatives and therefore the chief determinant of the observable
genetic properties of the population and of the response of the
population to selection. Moreover, it is the only component that can be
readily estimated from observations made on the population. In
practice, therefore, the important partition is into additive genetic
variance versus all the rest, the rest being non-additive genetic and
environmental variance. This partitioning is most conveniently expressed
as the ratio of additive genetic to total phenotypic variance, V_A/V_P, a
ratio called the heritability.

Estimation  of  the  additive  variance  rests  on  observation  of  the 
degree  of  resemblance  between  relatives  and  will  be  described  later 
when  we  have  discussed  the  causes  of  resemblance  between  relatives. 
Our  immediate  concern  here  is  to  show  how  the  genetic  components 
of  variance  are  influenced  by  the  gene  frequency.  To  do  this  we  have 
to express the variance in terms of the gene frequency and the
assigned genotypic values a and d. We shall consider first a single locus
with  two  alleles,  thus  excluding  interaction  variance  for  the  moment. 

Additive  and  dominance  variance.  The  information  needed  to 
obtain  expressions  for  the  variance  of  breeding  values  and  the  variance 
of  dominance  deviations  was  given  in  the  last  chapter  in  Table  7.3 
(p. 125). This table gives the breeding values and dominance
deviations of the three genotypes, expressed as deviations from the
population mean. It will be remembered that the means of both breeding
values  and  dominance  deviations  are  zero.  Therefore  no  correction 
for  an  assumed  mean  is  needed,  and  the  variance  is  simply  the  mean 
of  the  squared  values.  The  variances  are  thus  obtained  by  squaring 
the  values  in  the  table,  multiplying  by  the  frequency  of  the  genotype 
concerned,  and  summing  over  the  three  genotypes.  (The  procedure 
of  multiplying  values  by  frequencies  to  obtain  the  mean  was  explained 
on  p.  114.)  The  additive  variance,  which  is  the  variance  of  breeding 
values,  is  obtained  as  follows: 


    V_A = \alpha^2 [4p^2q^2 + (q-p)^2 \cdot 2pq + 4p^2q^2]
        = 2pq\alpha^2 (2pq + p^2 + q^2 - 2pq + 2pq)
        = 2pq\alpha^2                                        (8.5a)
        = 2pq[a + d(q-p)]^2                                  (8.5b)

Similarly the variance of dominance deviations is

    V_D = d^2 (4q^4p^2 + 8p^3q^3 + 4p^4q^2)
        = (2pqd)^2                                           (8.6)

It  was  noted  in  the  last  chapter  that  breeding  values  and  dominance 
deviations  are  uncorrelated.  From  this  it  follows  that  the  genotypic 
variance  is  simply  the  sum  of  the  additive  and  dominance  variances. 
Thus 

    V_G = V_A + V_D
        = 2pq[a + d(q-p)]^2 + [2pqd]^2                       (8.7)

Example 8.3. To illustrate the genetic components of variance arising
from a single locus let us return to the pygmy gene in mice, used for
several examples in the last chapter. From the values tabulated in
Example 7.6 (p. 123) we may compute the components of variance directly.
Since the values are expressed as deviations from the population mean, the
variance is obtained by multiplying the frequency of each genotype by
the square of its value, and summing over the three genotypes. For
example, the genotypic variance when q = 0.1 is
0.81(0.44)^2 + 0.18(-1.56)^2 + 0.01(-7.56)^2 = 1.1664. The additive
variance is obtained in the same way from the variance of breeding
values, and the dominance variance from the variance of dominance
deviations. The variances obtained are as follows:

                        q = 0.1     q = 0.4
    Genotypic, V_G      1.1664      7.1424
    Additive,  V_A      1.0368      6.2208
    Dominance, V_D      0.1296      0.9216

The variances may be obtained also, and with less trouble, by use of the
formulae given above in equations 8.5, 8.6 and 8.7. The values to be
substituted were given in Example 7.1; namely, a = 4 and d = 2. Notice that
the dominance variance is quite small in comparison with the additive.
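
The two routes described in Example 8.3 (direct summation over the three genotypes, and substitution in equations 8.5 to 8.7) can be compared in a short Python sketch. The function names are illustrative only. The breeding values and dominance deviations are the usual single-locus expressions under random mating, which are assumed here to be the ones tabulated in Table 7.3; q is taken as the frequency of the recessive (pygmy) allele, with a = 4 and d = 2.

```python
def single_locus_table(q, a, d):
    """Genotype frequencies, breeding values and dominance deviations.

    Genotypes are ordered A1A1, A1A2, A2A2; q is the frequency of A2.
    Values are deviations from the population mean (random mating assumed).
    """
    p = 1.0 - q
    alpha = a + d * (q - p)  # average effect of the gene substitution
    freqs = [p * p, 2 * p * q, q * q]
    breeding = [2 * q * alpha, (q - p) * alpha, -2 * p * alpha]
    dominance = [-2 * q * q * d, 2 * p * q * d, -2 * p * p * d]
    return freqs, breeding, dominance


def variances_by_summation(q, a, d):
    """Frequency-weighted mean of squared values, as in Example 8.3."""
    freqs, breeding, dominance = single_locus_table(q, a, d)
    v_a = sum(f * b * b for f, b in zip(freqs, breeding))
    v_d = sum(f * x * x for f, x in zip(freqs, dominance))
    v_g = sum(f * (b + x) ** 2 for f, b, x in zip(freqs, breeding, dominance))
    return v_a, v_d, v_g


def variances_by_formula(q, a, d):
    """Equations 8.5b, 8.6 and 8.7."""
    p = 1.0 - q
    v_a = 2 * p * q * (a + d * (q - p)) ** 2
    v_d = (2 * p * q * d) ** 2
    return v_a, v_d, v_a + v_d


for q in (0.1, 0.4):
    v_a, v_d, v_g = variances_by_formula(q, a=4.0, d=2.0)
    print(f"q = {q}: V_A = {v_a:.4f}, V_D = {v_d:.4f}, V_G = {v_g:.4f}")
```

Both functions should return the same three values for any q, a and d, because breeding values and dominance deviations are uncorrelated and the genotypic variance is therefore the plain sum of the two components.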

The ways in which the gene frequency and the degree of
dominance influence the magnitude of the genetic components of
variance can best be appreciated from graphical representations of the
relationships derived above, in equations 8.5, 8.6, and 8.7. The
graphs in Fig. 8.1 show the amounts of genotypic, additive, and
dominance variance arising from a single locus with two alleles,
plotted against the gene frequency. Three cases are shown to
illustrate the effect of different degrees of dominance: in graph (a) there
is no dominance (d = 0); in graph (b) there is complete dominance
(d = a); and in graph (c) there is "pure" overdominance (a = 0).
In the first case the genotypic variance is all additive, and it is
greatest when p = q = 0.5. In the second case the dominance variance
is maximal when p = q = 0.5, and the additive is maximal when the
frequency of the recessive allele is q = 0.75. In the third case the
dominance variance is the same as in the second and is maximal
when p = q = 0.5. The additive variance, however, is zero when
p = q = 0.5, and has two maxima, one at q = 0.15 and the other at
q = 0.85. The genotypic variance, in this case, remains practically
constant over a wide range of gene frequency, though its composition
changes profoundly. The general conclusion to be drawn from these
graphs is that genes contribute much more variance when at
intermediate frequencies than when at high or low frequencies: recessives
at low frequency, in particular, contribute very little variance.

Fig. 8.1. Magnitude of the genetic components of variance
arising from a single locus with two alleles, in relation to the gene
frequency. Genotypic variance: thick lines; additive variance:
thin lines; dominance variance: broken lines. The gene
frequency, q, is that of the recessive allele. The degrees of dominance
are: in (a) no dominance (d = 0); in (b) complete dominance (d = a);
and in (c) "pure" overdominance (a = 0). The figures on the vertical
scale, showing the amount of variance, are to be multiplied by a^2
in graphs (a) and (b), and by d^2 in graph (c).
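
The positions of the maxima quoted for the three cases of Fig. 8.1 can be checked numerically from equations 8.5b, 8.6 and 8.7. The sketch below is only an illustration: the helper names and the simple grid search are assumptions of this example, and the variances are scaled by setting a = 1 or d = 1, as on the vertical scale of the figure.

```python
def components(q, a, d):
    """Return (V_A, V_D, V_G) from equations 8.5b, 8.6 and 8.7."""
    p = 1.0 - q
    v_a = 2 * p * q * (a + d * (q - p)) ** 2
    v_d = (2 * p * q * d) ** 2
    return v_a, v_d, v_a + v_d


def argmax_q(func, lo=0.0, hi=1.0, steps=2000):
    """Grid value of q in [lo, hi] at which func(q) is largest."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(grid, key=func)


# (a, d) for the three cases: no dominance, complete dominance,
# and "pure" overdominance.
no_dom, complete_dom, over_dom = (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)

print("(a) V_A peak at q =", argmax_q(lambda q: components(q, *no_dom)[0]))
print("(b) V_A peak at q =", argmax_q(lambda q: components(q, *complete_dom)[0]))
print("(b) V_D peak at q =", argmax_q(lambda q: components(q, *complete_dom)[1]))
# In case (c) the additive variance has one peak on each side of q = 0.5.
print("(c) V_A peaks at q =",
      round(argmax_q(lambda q: components(q, *over_dom)[0], 0.0, 0.5), 3),
      "and",
      round(argmax_q(lambda q: components(q, *over_dom)[0], 0.5, 1.0), 3))
```

A grid search is used instead of calculus so that the same few lines serve for any pair of a and d; the grid is fine enough for the precision quoted in the text.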

A possible misunderstanding about the concept of additive
genetic variance, to which the terminology may give rise, should be
mentioned here. The concept of additive variance does not carry
with it the assumption of additive gene action; and the existence of
additive variance is not an indication that any of the genes act
additively (i.e. show neither dominance nor epistasis). No assumption is
made about the mode of action of the genes concerned. Additive
variance can arise from genes with any degree of dominance or
epistasis, and only if we find that all the genotypic variance is additive can
we conclude that the genes show neither dominance nor epistasis.

The  existence  of  more  than  two  alleles  at  a  locus  introduces  no 
new  principle,  though  it  complicates  the  theoretical  description  of  the 
effect  of  the  locus.  Expressions  for  the  additive  and  dominance 
variances  are  given  by  Kempthorne  (1955a).  The  locus  contributes 
additive  variance  arising  from  the  average  effects  of  its  several  alleles, 
and dominance variance arising from the several dominance
deviations.

To  arrive  at  the  variance  components  expressed  in  the  population 
the  separate  effects  of  all  loci  that  contribute  variance  have  to  be 
combined.  The  additive  variance  arising  from  all  loci  together  is  the 
sum  of  the  additive  variances  attributable  to  each  locus  separately; 
and the dominance variance is similarly the sum of the separate
contributions. But when more than one locus is under consideration then
the  interaction  deviations,  if  present,  give  rise  to  another  component 
of  variance,  the  interaction  variance,  which  is  the  variance  of  the 
interaction  deviations. 

Interaction variance. We shall treat the interaction variance as
a complication, like genotype-environment correlation or
interaction, to be circumvented: that is to say, we shall not discuss its
properties in detail, but we shall show what happens to it if it is
ignored. It is only comparatively recently that the properties of the
interaction variance have been worked out (see Cockerham, 1954;
Kempthorne, 1954, 1955a, b) and little is yet known about its
importance in relation to the other components. It seems probable,
however, that the amount of variance contributed by it is usually rather
small, and that neglect of it is therefore not likely to lead to serious
error. Description of the properties of interaction variance rests on
its further subdivision into components. It is first subdivided
according to the number of loci involved: two-factor interaction arises
from the interaction of two loci, three-factor from three loci, etc.
Interactions involving larger numbers of loci contribute so little
variance that they can be ignored, and we shall confine our attention
to two-factor interactions since these suffice to illustrate the principles
involved. The next subdivision of the interaction variance is
according to whether the interaction involves breeding values or dominance
deviations. There are thus three sorts of two-factor interactions.
Interaction between the two breeding values gives rise to additive x
additive variance, V_{AA}; interaction between the breeding value of one
locus and the dominance deviation of the other gives rise to additive x
dominance variance, V_{AD}; and interaction between the two
dominance deviations gives rise to dominance x dominance variance, V_{DD}.
So the interaction variance is broken down into components thus:

    V_I = V_{AA} + V_{AD} + V_{DD} + etc.                    (8.8)

the terms designated "etc." being similar components arising from
interactions between more than two loci. At the moment we cannot
go further than this in the description of the interaction variance, but
we shall show later how it affects the resemblance between relatives
and what happens to it when components of variance are estimated
from observations on the population.

That completes the description of the nature of the genetic
components of variance. The practical value of the partitioning of the
variance will not yet be fully apparent because it arises from the
causes of resemblance between relatives, which is the subject of the
next chapter. The partitioning we have made is essentially a
theoretical one, and before we pass on we should consider how much of it
can actually be made in practice. When observations of resemblance
between relatives are available we can estimate the additive variance
and so make the partition V_A : (V_D + V_I + V_E). And if inbred lines
are available we can estimate the environmental variance and so make
the partition V_G : V_E. If both these partitions are made we can
separate the additive genetic from the rest of the genetic variance, and
so make the three-fold partition into additive genetic, non-additive
genetic, and environmental variance, V_A : (V_D + V_I) : V_E, the
dominance and interaction components being lumped together as
non-additive genetic variance. Examples of this partitioning are given in
Table 8.2, although at this stage the method by which the additive
component is estimated will not be understood. This partitioning is
as far as we can go by means of relatively simple experiments. By
more elaborate techniques, requiring large numbers of observations,
it may be possible to go some way toward separating the dominance
from the interaction components, or at least to get an idea of their
relative importance. (See, in particular, Robinson and Comstock,
1955; Hayman, 1955, 1958; Cockerham, 1956b.)

Table 8.2

Partitioning of the variance of four characters in Drosophila
melanogaster. Components as percentages of the total,
phenotypic, variance.

                                             Character
                                     (1)       (2)      (3)     (4)
                                   Bristles   Thorax   Ovary   Eggs
    Phenotypic            V_P        100       100      100     100
    Additive genetic      V_A         52        43       30      18
    Non-additive genetic  V_D + V_I    9         6       40      44
    Environmental         V_E         39        51       30      38

Characters:

(1) Number of bristles on 4th + 5th abdominal segments (Clayton, Morris,
    and Robertson, 1957; Reeve and Robertson, 1954).

(2) Length of thorax (F. W. Robertson, 1957b).

(3) Size of ovaries, i.e. number of ovarioles in both ovaries (F. W.
    Robertson, 1957a).

(4) Number of eggs laid in 4 days (4th to 8th after emergence) (F. W.
    Robertson, 1957b).

Environmental  Variance 

Environmental variance, which by definition embraces all
variation of non-genetic origin, can have a great variety of causes and its
nature depends very much on the character and the organism studied.
Generally speaking, environmental variance is a source of error that
reduces precision in genetical studies and the aim of the experimenter
or breeder is therefore to reduce it as much as possible by careful
management or proper design of experiments. Nutritional and
climatic factors are the commonest external causes of environmental
variation, and they are at least partly under experimental control.
Maternal effects form another source of environmental variation that
is sometimes important, particularly in mammals, but is less
susceptible to control. Maternal effects are prenatal and postnatal
influences, mainly nutritional, of the mother on her young: we shall
have more to say about them in the next chapter in connexion with
resemblance between relatives. Error of measurement is another
source of variation, though it is usually quite trivial. When a
character can be measured in units of length or weight it is usually measured
so accurately that the variance attributable to measurement is
negligible in comparison with the rest of the variance. Some characters,
however, cannot strictly speaking be measured, but have to be graded
by judgement into classes. Carcass qualities of livestock are an
example. With such characters the variance due to measurement may
be considerable.

In addition to the variation arising from recognisable causes, such
as those mentioned, there is usually also a substantial amount of
non-genetic variation whose cause is unknown, and which therefore
cannot be eliminated by experimental design. This is generally
referred to as "intangible" variation. Some of the intangible
variation may be caused by "environmental" circumstances, in the common
meaning of the word (that is, by circumstances external to the
individual), even though their nature is not known. Some, however,
may arise from "developmental" variation: variation, that is, which
cannot be attributed to external circumstances, but is attributed, in
ignorance of its exact nature, to "accidents" or "errors" of
development as a general cause. Characters whose intangible variation is
predominantly developmental are those connected with anatomical
structure, which do not change after development is complete, such
as skeletal form, pigmentation, or bristle number in Drosophila.
Characters more susceptible to the influences of the external
environment, in contrast, are those connected with metabolic processes, such
as growth, fertility, and lactation.


Example 8.4. Human birth weight provides an example of a character
subject to much environmental variation whose nature has been analysed
in detail (Penrose, 1954; Robson, 1955). The partitioning of the
phenotypic variance given in the table shows the relative importance of all
the identified sources of variation, birth weight being regarded as a
character of the child. All the environmental variation is "maternal" in
the sense that it is connected with the prenatal environment, but several
distinct components of the maternal environment are distinguished.
"Maternal genotype," which accounts for 20 per cent of the total
phenotypic variance, reflects genetic variation (chiefly additive) between
mothers in the birth weight of their children; i.e. birth weight regarded
as a character of the mother. "Maternal environment, general," which
accounts for another 18 per cent, reflects non-genetic variation between
mothers in the same way. These two components, totalling 38 per cent,
are maternal causes of variation in birth weight that affect all children
of the same mother alike. "Maternal environment, immediate" means
causes attributable to the mother but differing in successive pregnancies.
Two causes of the same nature, "age of mother" and "parity" (i.e. whether
the child is the first, second, etc.), are separately identifiable.
Finally, the "intangible" variation is all the remainder, of which the
cause cannot be identified. To explain how these various components were
estimated would take too much space, and could not properly be done until
the end of Chapter 10. It must suffice to say that the estimates all come
from comparisons of the degree of resemblance between identical twins,
fraternal twins, full sibs, children of sisters, and other sorts of
cousins.

Partitioning of variance of human birth-weight. Components as
percentages of the total, phenotypic, variance.

    Cause of variation                        % of total
    Genetic
      Additive                                    15
      Non-additive (approx)                        1
      Sex                                          2
      Total genotypic                             18
    Environmental
      Maternal genotype                           20
      Maternal environment, general               18
      Maternal environment, immediate              6
      Age of mother                                1
      Parity                                       7
      Intangible                                  30
      Total environmental                         82

Multiple measurements. When more than one measurement
of the character can be made on each individual, the phenotypic
variance can be partitioned into variance within individuals and
variance between individuals. This subdivision serves to show how
much is to be gained by the repetition of measurements, and it may
also throw light on the nature of the environmental variation. There
are two ways by which the repetition of a character may provide
multiple measurements: by temporal repetition and by spatial
repetition. Milk-yield and litter size are examples of characters repeated
in time. Milk-yield can be measured in successive lactations, and
litter size in successive pregnancies. Several measurements of each
individual can thus be obtained. The variance of yield per lactation, or
of the number of young per litter, can then be analysed into a
component within individuals, measuring the differences between the
performances of the same individual, and a component between
individuals, measuring the permanent differences between individuals.
The within-individual component is entirely environmental in
origin, caused by temporary differences of environment between
successive performances. The between-individual component is partly
environmental and partly genetic, the environmental part being
caused by circumstances that affect the individuals permanently. By
this analysis, therefore, the variance due to temporary environmental
circumstances is separated from the rest, and can be measured.

Characters repeated in space are chiefly structural or anatomical,
and are found more often in plants than in animals. For example,
plants that bear more than one fruit yield more than one
measurement of any character of the fruit, such as its shape or seed content.
Spatial repetition in animals is chiefly found in characters that can be
measured on the two sides of the body or on serially repeated parts,
such as the number of bristles on the abdominal segments of
Drosophila. With spatially repeated characters the within-individual
variance is again entirely environmental in origin but, unlike that of
temporally repeated characters, it represents the "developmental"
variation arising from localised circumstances operating during
development.

In order that we may discuss both temporal and spatial repetition
together we shall use the term special environmental variance, V_{Es}, to
refer to the within-individual variance arising from temporary or
localised circumstances; and the term general environmental variance,
V_{Eg}, to refer to the environmental variance contributing to the
between-individual component and arising from permanent or
non-localised circumstances. The ratio of the between-individual
component to the total phenotypic variance measures the correlation (r)
between repeated measurements of the same individual, and is
known as the repeatability of the character. The partitioning of the
phenotypic variance expressed by the repeatability is thus into two
components, V_{Es} versus (V_G + V_{Eg}), so that the repeatability is

    r = \frac{V_G + V_{Eg}}{V_P}                             (8.9)

The repeatability therefore expresses the proportion of the variance
of single measurements that is due to permanent, or non-localised,
differences between individuals, both genetic and environmental.
The repeatability differs very much according to the nature of the
character, and also, of course, according to the genetic properties of
the population and the environmental conditions under which the
individuals are kept. The estimates in Table 8.3 give some idea of
the sort of values that may be found with various characters, and two
cases are described in more detail in the following examples.

Table 8.3
Some Examples of Repeatability

    Organism and character                              Repeatability
    Drosophila melanogaster:
      Abdominal bristle number (see Example 8.6)            0.42
      Ovary size (F. W. Robertson, 1957a)                   0.73
    Mouse:
      Weight at 6 weeks (repeated on 4 consecutive
        days; original data)                                0.95
      Litter size (see Example 8.5)                         0.45
    Sheep:
      Weight of fleece, measured in different years
        (Morley, 1951)                                      0.74
    Cattle:
      Milk-yield (Johansson, 1950)                          0.40

Example 8.5. Litter size in mice will serve as an example of a character
repeated in time. The number of live young born in first and in second
litters was recorded in 296 mice of a genetically heterogeneous stock, and
yielded the following components of variance (original data):

    Between mice    3.58
    Within mice     4.44

(The procedure for estimating the components of variance from an
analysis of variance is described by Snedecor (1956, Section 10.12) and is
outlined below, in Chapter 10, p. 173.) The repeatability of litter size is
given by the ratio of the between-mice component to the sum of the
between-mice and the within-mice components: i.e.

    r = \frac{3.58}{3.58 + 4.44} = 0.45

Example 8.6. The number of bristles on the ventral surfaces of the
abdominal segments is a character that has been much studied in
Drosophila melanogaster, because it is technically convenient and its
genetic properties are relatively simple. We have already mentioned it
several times but have not yet used it as an example. There are about 20
bristles on each of 3 segments in males and each of 4 segments in females.
The number of bristles per segment can therefore be treated as a spatially
repeated character. The sources of variation in this character have been
studied in detail by Reeve and Robertson (1954), and the following
components of variance were found:

                                          Males    Females
    Total phenotypic   V_P                4.24      5.44
    Between flies      V_G + V_{Eg}       1.82      2.19
    Within flies       V_{Es}             2.42      3.25
    Repeatability                         0.429     0.403
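
The arithmetic of Examples 8.5 and 8.6 is the same in both cases: the repeatability is the between-individual component divided by the sum of the between and within components (equation 8.9). A minimal sketch follows; the function name is illustrative, and the two columns of Example 8.6 are read here as males and females.

```python
def repeatability(between, within):
    """Between-individual component as a fraction of the total (equation 8.9)."""
    return between / (between + within)


# (between, within) components of variance quoted in the two examples.
estimates = {
    "mouse litter size (Example 8.5)": (3.58, 4.44),
    "bristle number, males (Example 8.6)": (1.82, 2.42),
    "bristle number, females (Example 8.6)": (2.19, 3.25),
}

for label, (between, within) in estimates.items():
    print(f"{label}: r = {repeatability(between, within):.3f}")
```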

Estimation of the repeatability of a character separates off the
component of variance due to special environment, V_{Es}, but it leaves
the other component of environmental variance (that due to general
environment, V_{Eg}) confounded with the genotypic variance, as
shown in the above example. The component due to general
environment can be separately estimated only if the genotypic variance
(i.e. including the non-additive components) has been estimated, in
the manner explained in Example 8.1. This has been done with two
characters in Drosophila, and the results are given in Table 8.4.

Table 8.4

Partitioning of the environmental variance of two characters
in Drosophila melanogaster into components due to
general, V_{Eg}, and special, V_{Es}, environment. The
characters are: abdominal bristle-number (Reeve and Robertson,
1954) as explained in Example 8.6, and ovary size (F. W.
Robertson, 1957a), measured in the two ovaries by the
number of ovarioles, or "egg strings."

                                       Bristle     Ovary
                                       number      size
    Total environmental,   V_E          100         100
    General environmental, V_{Eg}         3           9
    Special environmental, V_{Es}        97          91

The nature of the environmental variation revealed by these results is
remarkable. With both characters less than 10 per cent of the
environmental variance is general, that is, due to causes influencing
the individual as a whole. These characters are therefore very little
influenced by the conditions of the external environment: or, perhaps
it would be more accurate to say that the experimental technique of
rearing the flies has been very successful in eliminating unwanted
sources of environmental variation. Yet, fully half the phenotypic
variation of one measurement (one segment or ovary) is non-genetic,
or environmental in the wide sense, as shown in Table 8.2; and,
moreover, is due to strictly localised causes that influence the
segments or ovaries independently. Whether this developmental
variation represents a real indeterminacy of development, or has
material causes still undetected but in principle controllable, is quite
unknown. Nor is it known whether the situation revealed in these
two characters is at all general. We cannot here pursue further the
biological nature of the non-genetic variation: a general discussion of
these problems will be found in Waddington (1957).

We must return to the repeatability and consider its uses.
Knowledge of the repeatability of a character is useful in two ways.
First, it sets upper limits to the values of the two ratios, V_A/V_P and
V_G/V_P, because the between-individual component, V_G + V_{Eg},
cannot be less than V_G, which in turn cannot be less than V_A. The
first (additive genetic to total phenotypic variance) is the
heritability, which as we shall see in later chapters is of great practical
importance. The second (genotypic to phenotypic variance) measures
the degree of genetic determination of the character. The repeatability
is usually much easier to determine than either of these two ratios,
and it may often be known when they are not.

The  second  way  in  which  knowledge  of  the  repeatability  is  useful 
is  that  it  indicates  the  gain  in  accuracy  to  be  expected  from  multiple 
measurements.  Suppose  that  each  individual  is  measured  n  times, 
and  that  the  mean  of  these  n  measurements  is  taken  to  be  the  pheno- 
typic value  of  the  individual,  say  P(n).  Then  the  phenotypic  variance 
is  made  up  of  the  genotypic  variance,  the  general  environmental 
variance,  and  one  nth  of  the  special  environmental  variance: 

$$V_{P(n)} = V_G + V_{Eg} + \frac{1}{n} V_{Es} \qquad (8.10)$$

Thus, increasing the number of measurements reduces the amount
of variance due to special environment that appears in the phenotypic
variance, and this reduction of the phenotypic variance represents
the gain in accuracy. The variance of the mean of n measurements
as a proportion of the variance of one measurement can be
expressed in terms of the repeatability, as follows:

$$\frac{V_{P(n)}}{V_P} = \frac{1 + r(n - 1)}{n} \qquad (8.11)$$


where r is the repeatability, or the correlation between the measurements
of the same individual. Fig. 8.2 shows how the phenotypic
variance is reduced by multiple measurements, with characters of
different repeatabilities. When the repeatability is high, and there is
therefore little special environmental variance, multiple measurements
give little gain in accuracy. When the repeatability is low,
multiple measurements may lead to a worth-while gain in accuracy.
The gain in accuracy, however, falls off rapidly as the number of
measurements increases, and it is seldom worth while to make more
than two measurements.

Fig. 8.2. Gain in accuracy from multiple measurements of each
individual. The vertical scale gives the variance of the mean of n
measurements as a percentage of the variance of one measurement.
The horizontal scale gives the number of measurements, up to 10.
The four graphs refer to characters of different repeatability as
indicated.
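
The rapid falling-off of the gain can be seen by tabulating equation 8.11. The sketch below is only illustrative: the function name `variance_ratio` and the three repeatabilities are our own choices, not values taken from the figure.

```python
def variance_ratio(r, n):
    """Variance of the mean of n measurements relative to one measurement (eq. 8.11)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return (1 + r * (n - 1)) / n

# Tabulate the ratio, as a percentage, for a few arbitrary repeatabilities.
for r in (0.25, 0.5, 0.75):
    row = [round(100 * variance_ratio(r, n), 1) for n in (1, 2, 3, 5, 10)]
    print(r, row)
```

With n very large the ratio approaches r itself, which is why little can be gained when the repeatability is already high.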

Example  8.7.  Studies  of  abdominal  bristle  number  in  Drosophila  are 
generally  based  on  two  measurements,  i.e.  of  the  fourth  and  fifth  seg- 
ments, and  the  phenotypic  values  are  expressed  as  the  sum  of  the  two 
counts.  As  an  illustration  of  the  nature  of  the  advantage  gained  by  the 
double  measurement  we  may  compare  the  percentage  composition  of  the 
phenotypic  variance  when  phenotypic  values  are  based  on  counts  of  one 
or  of  two  segments: 


                                            One        Two
                                            segment    segments
Phenotypic                $V_P$             100        100
Additive genetic          $V_A$              34         52
Non-additive genetic      $V_D + V_I$         6          9
Environmental, general    $V_{Eg}$            2          4
Environmental, special    $V_{Es}$           58         35

By  reducing  the  amount  of  environmental  variance,  the  making  of  two 
measurements  increases  the  proportionate  amount  of  genetic  variance:  in 
practice  it  is  the  increase  of  the  proportion  of  additive  variance — in  this 
case  from  34  per  cent  to  52  per  cent — that  is  the  important  consideration. 

There  is  an  important  assumption  implicit  in  the  idea  of  repeata- 
bility, which  we  have  not  yet  mentioned.  It  is  the  assumption  that 
the  multiple  measurements  are  indeed  measurements  of  what  is 
genetically  the  same  character.  Consider  for  example  milk-yield  in 
successive  lactations.  If  the  assumption  were  valid  it  would  mean  that 
the  genes  that  influence  yield  in  first  lactations  are  entirely  the  same 
as  those  that  influence  yield  in  second  or  later  lactations;  or,  to  put  the 
matter  in  another  way,  that  yield  in  all  lactations  is  dependent  on 
identical  developmental  and  physiological  processes.  If  this  assump- 
tion is  not  valid,  as  it  certainly  is  not  for  milk-yield  in  cattle,  then  the 
variation  within  individuals  is  not  purely  environmental,  and  equation 
8.11  is  erroneous.  The  variance  between  the  means  of  individuals 
will  be  augmented  by  additional  variance  arising  from  what  may 
formally be regarded as interaction between genotype and "environment,"
that is between genotype and the time or location of the
measurement.  And  this  additional  variance  may  be  enough  to 
counteract  the  reduction  of  environmental  variance  which  we  have 
described  as  the  chief  advantage  to  be  gained  from  multiple  measure- 
ments. Consequently  an  increase  in  the  proportion  of  additive  genetic 
variance  from  multiple  measurements  cannot  be  relied  on  until  the 
genetical  identity  of  the  character  measured  has  been  established. 
The  number  of  bristles  on  the  abdominal  segments  of  Drosophila  has 
been  proved  to  be  genetically  the  same  character,  as  will  be  explained 
in  Chapter  19,  and  the  conclusions  reached  in  Example  8.7  are  valid. 
Milk-yield in cattle, in contrast, is not the same character in successive
lactations, and the proportion of additive variance is actually
less for the mean of several lactations than for first lactations only.
(See Rendel et al., 1957.)


CHAPTER   9 

RESEMBLANCE  BETWEEN  RELATIVES 

The  resemblance  between  relatives  is  one  of  the  basic  genetic  pheno- 
mena displayed  by  metric  characters,  and  the  degree  of  resemblance 
is  a  property  of  the  character  that  can  be  determined  by  relatively 
simple  measurements  made  on  the  population  without  special  experi- 
mental techniques.  The  degree  of  resemblance  provides  the  means 
of  estimating  the  amount  of  additive  variance,  and  it  is  the  propor- 
tionate amount  of  additive  variance  (i.e.  the  heritability)  that  chiefly 
determines  the  best  breeding  method  to  be  used  for  improvement. 
An  understanding  of  the  causes  of  resemblance  between  relatives  is 
therefore  fundamental  to  the  practical  study  of  metric  characters  and 
to  its  application  in  animal  and  plant  improvement.  In  this  chapter, 
therefore,  we  shall  examine  the  causes  of  resemblance  between  rela- 
tives, and  show  in  principle  how  the  amount  of  additive  variance  can 
be  estimated  from  the  observed  degree  of  resemblance,  leaving  the 
more  practical  aspects  of  the  estimation  of  the  heritability  for  con- 
sideration in  the  next  chapter. 

In  the  last  chapter  we  saw  how  the  phenotypic  variance  can  be 
partitioned  into  components  attributable  to  different  causes.  These 
components  we  shall  call  causal  components  of  variance,  and  denote 
them  as  before  by  the  symbol  V.  The  measurement  of  the  degree  of 
resemblance  between  relatives  rests  on  the  partitioning  of  the  pheno- 
typic variance  in  a  different  way,  into  components  corresponding  to 
the  grouping  of  the  individuals  into  families.  These  components  can 
be  estimated  directly  from  the  phenotypic  values  and  for  this  reason 
we  shall  call  them  observational  components  of  phenotypic  variance, 
and denote them by the symbol $\sigma^2$ in order to keep the distinction
clear.  Consider,  for  example,  the  grouping  of  individuals  into 
families  of  full  sibs.  By  the  analysis  of  variance  we  can  partition  the 
total  observed  variance  into  two  components,  within  groups  and 
between  groups.  The  within-group  component  is  the  variance  of 
individuals about their group means, and the between-group component
is the variance of the "true" means of the groups about the
population mean. The true mean of a group is the mean estimated
without error from a very large number of individuals. An explanation
of the estimation of these two components will be given, with
examples,  in  the  next  chapter.  Now,  the  resemblance  between  related 
individuals,  i.e.  between  full  sibs  in  the  case  under  discussion,  can  be 
looked  at  either  as  similarity  of  individuals  in  the  same  group,  or  as 
difference  between  individuals  in  different  groups.  The  greater  the 
similarity  within  the  groups,  the  greater  in  proportion  will  be  the 
difference  between  the  groups.  The  degree  of  resemblance  can 
therefore be expressed as the between-group component as a proportion
of the total variance. This is the intra-class correlation coefficient
and is given by

$$t = \frac{\sigma^2_B}{\sigma^2_B + \sigma^2_W}$$

where $\sigma^2_B$ is the between-group component and $\sigma^2_W$ the within-group
component.  (It  is  customary  to  use  the  symbol  t  for  the  intra-class 
correlation  of  phenotypic  values  in  order  to  avoid  confusion  with 
other  types  of  correlation  for  which  the  symbol  r  is  used.)  The 
between-group  component  expresses  the  amount  of  variation  that  is 
common  to  members  of  the  same  group,  and  it  can  equally  well  be 
referred  to  as  the  covariance  of  members  of  the  groups.  In  the  case  of 
the  resemblance  between  offspring  and  parents  the  grouping  of  the 
observations  is  into  pairs  rather  than  groups;  one  parent,  or  the  mean 
of  two  parents,  paired  with  one  offspring  or  the  mean  of  several 
offspring.  It  is  then  more  convenient  to  compute  the  covariance  of 
offspring  with  parents  from  the  sum  of  cross-products,  rather  than 
from  the  between-pair  component  of  variance.  With  offspring- 
parent  relationships,  also,  it  is  usually  more  convenient  to  express  the 
degree  of  resemblance  as  the  regression  coefficient  of  offspring  on 
parent,  instead  of  the  correlation  between  them,  the  regression  being 
given  by 


$$b_{OP} = \frac{\mathrm{cov}_{OP}}{\sigma^2_P}$$

where $\mathrm{cov}_{OP}$ is the covariance of offspring and parents, and $\sigma^2_P$ is the
variance  of  parents. 
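
The estimation of the two observational components is deferred to the next chapter. As a rough sketch only, and assuming the standard one-way analysis of variance with families of equal size k (so that the within-group component is the within-family mean square, and the between-group component is the difference of the two mean squares divided by k), the intra-class correlation might be computed as follows. The function name and the data layout are our own.

```python
def intraclass_correlation(families):
    """Estimate t = s2_B / (s2_B + s2_W) from families of equal size.

    `families` is a list of lists of phenotypic values, all of length k.
    """
    k = len(families[0])
    if k < 2 or len(families) < 2 or any(len(f) != k for f in families):
        raise ValueError("need at least two families of equal size k >= 2")
    n_fam = len(families)
    grand_mean = sum(sum(f) for f in families) / (n_fam * k)
    means = [sum(f) / k for f in families]

    # Mean squares of the one-way analysis of variance.
    ms_between = k * sum((m - grand_mean) ** 2 for m in means) / (n_fam - 1)
    ms_within = sum(
        (x - m) ** 2 for f, m in zip(families, means) for x in f
    ) / (n_fam * (k - 1))

    # Observational components of variance.
    s2_within = ms_within
    s2_between = (ms_between - ms_within) / k
    return s2_between / (s2_between + s2_within)
```

With small samples the estimated between-group component, and hence t, can come out negative; the sketch makes no attempt to deal with that case.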

Thus,  the  covariance  of  related  individuals  is  the  new  property 
of  the  population  that  we  have  to  deduce  in  seeking  the  cause  of 
resemblance between relatives, whether sibs or offspring and parents.
The covariance, being simply a portion of the total phenotypic
variance,  is  composed  of  the  causal  components  described  in  the  last 
chapter,  but  in  amounts  and  proportions  differing  according  to  the 
sort  of  relationship.  By  finding  out  how  the  causal  components  con- 
tribute to  the  covariance  we  shall  see  how  an  observed  covariance  can 
be  used  to  estimate  the  causal  components  of  which  it  is  composed. 

Both  genetic  and  environmental  sources  of  variance  contribute  to 
the  covariance  of  relatives.  We  shall  consider  the  genetic  causes  of 
resemblance  first,  then  the  environmental  causes,  and  finally,  by 
putting  the  two  causes  together,  arrive  at  the  phenotypic  covariance 
and  the  degree  of  resemblance  that  can  be  observed  from  measure- 
ments of  phenotypic  values.  A  general  description  of  the  covariance, 
applicable  to  any  sort  of  relationship,  is  given  by  Kempthorne 
(1955a).  Here  we  shall  consider  only  four  sorts  of  relationship:  (1) 
between  offspring  and  one  parent,  (2)  between  half  sibs,  (3)  between 
offspring  and  the  mean  of  the  two  parents,  and  (4)  between  full  sibs. 
These  are  the  most  important  relationships  in  practice.  Identical 
twins  will  be  considered  in  the  next  chapter,  because  the  problems 
they  raise  will  be  better  understood  then. 


Genetic  Covariance 

Our  object  now  is  to  deduce  from  theoretical  considerations  the 
covariance  of  relatives  arising  from  genetic  causes,  neglecting  for  the 
time  being  any  non-genetic  causes  of  resemblance  that  there  may  be. 
This  means  that  we  have  to  deduce  the  covariance  of  the  genotypic 
values  of  the  related  individuals.  This  will  be  done  by  reference  to 
two  alleles  at  a  locus,  but  the  conclusions  are  equally  valid  for  loci 
with  any  number  of  alleles.  We  shall  at  first  omit  interaction  deviations 
and  the  interaction  component  of  variance  from  consideration,  but 
we  shall  describe  its  effects  briefly  later. 

Offspring  and  one  parent.  The  covariance  to  be  deduced  is 
that  of  the  genotypic  values  of  individuals  with  the  mean  genotypic 
values  of  their  offspring  produced  by  mating  at  random  in  the  popu- 
lation. If  values  are  expressed  as  deviations  from  the  population 
mean,  then  the  mean  value  of  the  offspring  is  by  definition  half  the 
breeding  value  of  the  parent,  as  explained  in  Chapter  7.  Therefore 
the  covariance  to  be  computed  is  that  of  an  individual's  genotypic 
value with half its breeding value, i.e. the covariance of G with $\frac{1}{2}A$.
Since $G = A + D$ (D being the dominance deviation) the covariance
is that of $(A + D)$ with $\frac{1}{2}A$. Taking the sum of cross-products, we
have

$$\text{sum of cross-products} = \sum \tfrac{1}{2}A(A + D) = \tfrac{1}{2}\sum A^2 + \tfrac{1}{2}\sum AD$$

Since A and D are uncorrelated (see p. 125), the term $\frac{1}{2}\sum AD$ is
zero. Then if we divide both sides by the number of paired observations
we have

$$\mathrm{cov}_{OP} = \tfrac{1}{2}V_A \qquad (9.1)$$

since $\sum A^2$ is the sum of squares of breeding values. The genetic
covariance  of  offspring  and  one  parent  is  therefore  half  the  additive 
variance. 

The  covariance  may  be  derived  by  another  method,  which  though 
less  concise  is  perhaps  more  explicit.  Table  9.1  gives  the  genotypes 
of the parents, their frequencies in the population, and their genotypic
values expressed as deviations from the population mean (from
Table 7.3).

Table 9.1

            Parents                                     Offspring
Genotype    Frequency    Genotypic value                Mean genotypic value
$A_1A_1$    $p^2$        $2q(\alpha - qd)$              $q\alpha$
$A_1A_2$    $2pq$        $(q - p)\alpha + 2pqd$         $\frac{1}{2}(q - p)\alpha$
$A_2A_2$    $q^2$        $-2p(\alpha + pd)$             $-p\alpha$

The right-hand column gives the mean genotypic values
of the offspring, which are half the breeding values of the parents as
given  in  Table  7.3.  The  covariance  of  offspring  and  parent  is  then  the 
mean  cross-product,  and  is  obtained  by  multiplying  together  the 
three  columns — frequency  x  genotypic  value  of  parent  x  genotypic 
value  of  offspring — and  summing  over  the  three  genotypes  of  the 
parents. After collecting together the terms in $\alpha^2$ and the terms in $\alpha d$
we obtain

$$\begin{aligned}
\mathrm{cov}_{OP} &= pq\alpha^2(p^2 + 2pq + q^2) + 2p^2q^2\alpha d(-q + q - p + p) \\
&= pq\alpha^2 \\
&= \tfrac{1}{2}V_A
\end{aligned}$$

since from equation 8.5, $V_A = 2pq\alpha^2$. Summing over all loci we again
reach the conclusion that the covariance of offspring and one parent
is equal to half the additive variance.

Half sibs. Half sibs are individuals that have one parent in common
and the other parent different. A group of half sibs is therefore
the  progeny  of  one  individual  mated  at  random  and  having  one 
offspring  by  each  mate.  Thus  the  mean  genotypic  value  of  the  group 
of  half  sibs  is  by  definition  half  the  breeding  value  of  the  common 
parent.  The  covariance  is  the  variance  of  the  means  of  the  half-sib 
groups,  and  is  therefore  the  variance  of  half  the  breeding  values  of  the 
parents;  this  is  a  quarter  of  the  additive  variance: 

$$\mathrm{cov}_{(HS)} = V_{\frac{1}{2}A} = \tfrac{1}{4}V_A \qquad (9.2)$$

This  covariance  also  can  be  demonstrated  by  the  longer  method, 
from  the  values  already  given  in  Table  9.1.  The  covariance  is  the 
variance  of  the  means  of  the  groups  of  offspring  listed  in  the  right- 
hand  column.  Squaring  the  offspring  values  and  multiplying  by  their 
frequencies  we  get 

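Variance of means of half-sib families

$$\begin{aligned}
&= p^2q^2\alpha^2 + 2pq \cdot \tfrac{1}{4}(q - p)^2\alpha^2 + q^2p^2\alpha^2 \\
&= pq\alpha^2\left[pq + \tfrac{1}{2}(q - p)^2 + pq\right] \\
&= pq\alpha^2\left[\tfrac{1}{2}(p + q)^2\right] \\
&= \tfrac{1}{2}pq\alpha^2
\end{aligned}$$

Therefore, since $2pq\alpha^2 = V_A$ (from equation 8.5),

$$\mathrm{cov}_{(HS)} = \tfrac{1}{4}V_A$$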

summation  being  made  over  all  loci. 
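
The two results obtained by the longer method can be checked numerically for any gene frequency and any degree of dominance. The sketch below assumes a single locus with two alleles under random mating, with the average effect taken as $\alpha = a + d(q - p)$; the function name and the trial values of p, a and d are our own.

```python
def single_locus_check(p, a, d):
    """Return (cov_OP, var_HS, V_A) for one locus with two alleles."""
    q = 1 - p
    alpha = a + d * (q - p)              # average effect of the gene substitution
    freqs = [p * p, 2 * p * q, q * q]    # A1A1, A1A2, A2A2
    # Genotypic values of parents as deviations from the population mean.
    g = [
        2 * q * (alpha - q * d),
        (q - p) * alpha + 2 * p * q * d,
        -2 * p * (alpha + p * d),
    ]
    # Mean genotypic values of offspring: half the parent's breeding value.
    o = [q * alpha, 0.5 * (q - p) * alpha, -p * alpha]

    # Mean cross-product of parent and offspring values.
    cov_op = sum(f * gi * oi for f, gi, oi in zip(freqs, g, o))
    # Variance of the means of half-sib groups (their mean is zero).
    var_hs = sum(f * oi * oi for f, oi in zip(freqs, o))
    v_a = 2 * p * q * alpha ** 2
    return cov_op, var_hs, v_a


cov_op, var_hs, v_a = single_locus_check(p=0.3, a=1.0, d=0.4)
print(cov_op, v_a / 2)   # the two should agree
print(var_hs, v_a / 4)   # and so should these
```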

Offspring  and  mid-parent.  The  covariance  of  the  mean  of  the 
offspring and the mean of both parents (commonly called the "mid-parent")
may be deduced in the following way. Let O be the mean of
the offspring, and P and P' be the values of the two parents. Then
we want to find $\mathrm{cov}_{O\bar{P}}$; that is, the covariance of O with $\frac{1}{2}(P + P')$.
This is equal to $\frac{1}{2}(\mathrm{cov}_{OP} + \mathrm{cov}_{OP'})$. If P and P' have the same variance,
then $\mathrm{cov}_{OP} = \mathrm{cov}_{OP'}$ and $\mathrm{cov}_{O\bar{P}} = \mathrm{cov}_{OP}$. Thus, provided the two sexes
have  equal  variances,  the  covariance  of  offspring  and  mid-parent  is 
the  same  as  that  of  offspring  with  one  parent,  which  we  have  seen  is 
equal  to  half  the  additive  variance.  This  conclusion  may  be  extended 
to  other  sorts  of  relative:  the  covariance  of  any  individual  with  the 
mean  value  of  a  number  of  relatives  of  the  same  sort  is  equal  to  its 
covariance  with  one  of  those  relatives. 
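
The extension to other sorts of relative rests on the same step. As a sketch, let X be the value of the individual and $Y_1, \dots, Y_n$ the values of n relatives of the same sort (symbols introduced here for illustration), and assume that X has the same covariance, $\mathrm{cov}(X, Y)$, with each of them:

```latex
\operatorname{cov}\!\left(X,\ \frac{1}{n}\sum_{i=1}^{n} Y_i\right)
  = \frac{1}{n}\sum_{i=1}^{n}\operatorname{cov}(X, Y_i)
  = \operatorname{cov}(X, Y)
```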

The  longer  method  of  demonstrating  the  covariance  of  offspring 
with mid-parent is rather laborious, but it must be given since it will
be needed for arriving at the covariance of full sibs. We shall, however,
omit some of the steps of algebraic reduction. A table (Table
9.2) is made in the same manner as for offspring and one parent, but
now we have to tabulate types of mating and their frequencies, instead
of single parents. This was done in Chapter 1 (Table 1.1).
Against  each  type  of  mating  we  put  the  mean  genotypic  value  of  the 
two  parents,  i.e.  the  mid-parent  value;  then  the  genotypes  of  the  pro- 
geny and  the  mean  genotypic  value  of  the  progeny.  The  working  is 
made  easier  by  writing  the  genotypic  values  in  terms  of  a  and  d 
instead  of  as  deviations  from  the  population  mean.  In  the  last  two 
columns  of  the  table  we  put  the  product  of  progeny-mean  x  mid- 
parent,  and  the  square  of  the  progeny  for  later  use.  Now,  to  get  the 
covariance  of  progeny-mean  and  mid-parent  value,  we  take  the  pro- 
duct of  progeny-mean  x  mid-parent  and  multiply  it  by  the  frequency 
of  the  mating  type,  and  then  sum  over  mating  types.  This  gives  the 
mean  product  (M.P.)  from  which  we  have  to  deduct  a  correction  for 
the  population  mean,  since  values  are  not  here  expressed  as  deviations 
from  the  mean.  The  correction  is  simply  the  square  of  the  population 
mean  (M2)  since  the  means  of  parents  and  of  progeny  are  equal. 
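Both the M.P. and $M^2$ contain terms in $a^2$, in $ad$, and in $d^2$. By collecting
together these terms and simplifying a little we obtain

$$\begin{aligned}
\text{M.P.} &= a^2[p^3(p + q) + q^3(p + q)] + 2adpq(p^2 - q^2) + d^2pq(p^2 + 2pq + q^2) \\
M^2 &= a^2(p^2 - 2pq + q^2) + 4adpq(p - q) + 4d^2p^2q^2
\end{aligned}$$

Then,

$$\begin{aligned}
\mathrm{cov}_{O\bar{P}} &= \text{M.P.} - M^2 \\
&= a^2pq - 2adpq(p - q) + d^2pq(p - q)^2 \\
&= pq[a + d(q - p)]^2 \\
&= pq\alpha^2 \\
&= \tfrac{1}{2}V_A \qquad (9.3)
\end{aligned}$$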

when  summed  over  all  loci. 

So  the  genetic  covariance  of  offspring  with  the  mean  of  their  parents 
is  equal  to  half  the  additive  genetic  variance.  That  this  covariance 
comes  out  the  same  as  that  of  offspring  and  one  parent  need  cause  no 
surprise  when  we  note  that  the  variance  of  mid-parent  values  is  half 
the  variance  of  individual  values  (see  below,  p.  162). 

Full  sibs.  The  covariance  of  full  sibs  is  the  variance  of  the  means 
of  full-sib  families,  and  is  got  with  little  additional  work  from  Table 
9.2.  The  last  column  shows  the  squares  of  progeny  means  and  it  will 
be seen that these squares are all exactly the same as the products of
progeny-mean × mid-parent, except for the two entries in the middle
involving terms in $d^2$. The mean square (M.S.) can therefore be got
from the mean product (M.P.) already calculated, thus

$$\begin{aligned}
\text{M.S.} &= \text{M.P.} + d^2 \cdot 2p^2q^2 - \tfrac{1}{4}d^2 \cdot 4p^2q^2 \\
&= \text{M.P.} + d^2p^2q^2
\end{aligned}$$

The  correction  for  the  mean  is  the  same  as  before,  so  we  have 


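$$\begin{aligned}
\mathrm{cov}_{(FS)} &= \mathrm{cov}_{O\bar{P}} + d^2p^2q^2 \\
&= pq\alpha^2 + d^2p^2q^2
\end{aligned}$$

Since $2pq\alpha^2 = V_A$ (from equation 8.5) and $4d^2p^2q^2 = V_D$ (from equation
8.6) the covariance of full sibs is

$$\mathrm{cov}_{(FS)} = \tfrac{1}{2}V_A + \tfrac{1}{4}V_D \qquad (9.4)$$

summing over all loci.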

So  the  genetic  covariance  of  full  sibs  is  equal  to  half  the  additive 
genetic  variance  plus  a  quarter  of  the  dominance  variance.  This  is  the 
only  one  of  the  relationships  that  we  have  considered  where  we  find 
the  dominance  variance  contributing  to  the  resemblance.  The  reason 
is  that  full  sibs  have  both  parents  in  common,  and  a  pair  of  full  sibs 
have  a  quarter  chance  of  having  the  same  genotype  for  any  locus. 
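
The mid-parent and full-sib covariances can both be checked from the types of mating. Table 9.2 is not reproduced in this section, so the frequencies and values in the sketch below are written out from the usual random-mating frequencies and should be read as our reconstruction; the function name and trial values are likewise our own.

```python
def mating_table_check(p, a, d):
    """Return (cov of offspring with mid-parent, cov of full sibs) for one locus."""
    q = 1 - p
    # (frequency, mid-parent value, progeny mean), values in terms of a and d.
    matings = [
        (p ** 4,            a,           a),            # A1A1 x A1A1
        (4 * p ** 3 * q,    (a + d) / 2, (a + d) / 2),  # A1A1 x A1A2
        (2 * p * p * q * q, 0.0,         d),            # A1A1 x A2A2
        (4 * p * p * q * q, d,           d / 2),        # A1A2 x A1A2
        (4 * p * q ** 3,    (d - a) / 2, (d - a) / 2),  # A1A2 x A2A2
        (q ** 4,            -a,          -a),           # A2A2 x A2A2
    ]
    mean = a * (p - q) + 2 * d * p * q                  # population mean M
    mean_product = sum(f * mp * o for f, mp, o in matings)
    mean_square = sum(f * o * o for f, mp, o in matings)
    # Deduct the correction for the mean from both.
    return mean_product - mean ** 2, mean_square - mean ** 2


p, a, d = 0.3, 1.0, 0.4
q = 1 - p
alpha = a + d * (q - p)
cov_mid, cov_fs = mating_table_check(p, a, d)
print(cov_mid, p * q * alpha ** 2)                          # half of V_A
print(cov_fs, p * q * alpha ** 2 + d ** 2 * p ** 2 * q ** 2)  # half V_A plus quarter V_D
```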

Covariance  due  to  epistatic  interaction.  Before  we  turn  to  the 
environmental  causes  of  resemblance  between  relatives  let  us  briefly 
examine  the  role  of  interaction  variance  arising  from  epistasis.  In 
Chapter 8 we noted that the interaction variance, $V_I$, is subdivided
into  components  according  to  the  number  of  loci  interacting,  and 
according  to  whether  the  interaction  is  between  breeding  values  or 
dominance deviations. The covariances of relatives, with the contributions
of the two-factor interactions included, are shown in Table 9.3
(for details see Kempthorne, 1955a, b).

Table 9.3
Covariances of relatives including the contributions of
two-factor interactions.

                                        Variance components and the
                                        coefficients of their contributions
Relatives                               $V_A$   $V_D$   $V_{AA}$   $V_{AD}$   $V_{DD}$
Offspring-parent: $\mathrm{cov}_{OP}$ =      1/2     0       1/4        0          0
Half sibs: $\mathrm{cov}_{(HS)}$ =           1/4     0       1/16       0          0
Full sibs: $\mathrm{cov}_{(FS)}$ =           1/2     1/4     1/4        1/8        1/16
General: cov =                          $x$     $y$     $x^2$      $xy$       $y^2$

The offspring-parent covariance
refers equally to one parent and to mid-parent values.
For  the  sake  of  clarity  the  components  of  variance  are  shown  at  the 
heads  of  the  columns  and  their  coefficients  in  the  covariances  are 
listed below. For example, the offspring-parent covariance is
$\frac{1}{2}V_A + \frac{1}{4}V_{AA}$. The contributions of interaction to the covariances are
expressible in a simple general form, shown in the bottom line of the
table. If the covariance contains $xV_A$ then it contains also $x^2V_{AA}$; and
if it contains $yV_D$ it contains also $xyV_{AD}$ and $y^2V_{DD}$. Interactions involving
more than two loci contribute progressively smaller proportions
as the number of loci increases. The effect of the interaction
variance  on  the  resemblance  between  relatives  is,  in  principle,  that 
the  offspring-parent  covariance  is  not  twice  the  half-sib  covariance, 
but  a  little  more  than  twice;  and  that  the  excess  of  the  full-sib  co- 
variance  over  the  half-sib  represents  not  only  dominance  variance  but 
also  some  of  the  interaction  variance. 
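
The general rule lends itself to a small calculation. The sketch below takes the coefficients of $V_A$ and $V_D$ in a covariance (x and y) and returns the coefficients of the two-factor interaction components that go with them. The function name and dictionary keys are our own, and exact fractions are used only for legibility.

```python
from fractions import Fraction


def covariance_coefficients(x, y):
    """Coefficients of V_A, V_D, V_AA, V_AD and V_DD, given x (for V_A) and y (for V_D)."""
    x, y = Fraction(x), Fraction(y)
    return {"VA": x, "VD": y, "VAA": x * x, "VAD": x * y, "VDD": y * y}


relatives = {
    "offspring-parent": (Fraction(1, 2), 0),
    "half sibs": (Fraction(1, 4), 0),
    "full sibs": (Fraction(1, 2), Fraction(1, 4)),
}
for name, (x, y) in relatives.items():
    print(name, covariance_coefficients(x, y))
```

Twice the half-sib coefficient of $V_{AA}$ is 1/8, against 1/4 in the offspring-parent covariance, which is the sense in which the offspring-parent covariance is a little more than twice the half-sib covariance.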

When  the  interaction  variance  was  first  discussed  in  Chapter  8 
we  said  we  would  regard  it  as  a  complication  to  be  circumvented, 
noting  only  the  consequences  of  neglecting  it.  These  consequences 
are  now  apparent.  First,  only  small  fractions  of  it  contribute  to  the 
covariances  and  therefore  its  effect  on  the  resemblance  between  rela- 
tives is  small  unless  the  amount  of  interaction  variance  is  large  in 
comparison  with  the  other  components.  And  second,  it  appears  that 
there  is  little  we  can  do  in  practice  except  ignore  it,  because,  apart 
from  the  special  experimental  methods  mentioned  on  p.  139,  there  is 
no  practicable  means  of  separating  the  interaction  from  the  other 
components.  The  consequences  of  ignoring  the  interaction  variance 
are thus that any estimate of $V_A$ made from offspring-parent regressions
will contain also $\frac{1}{2}V_{AA} + \frac{1}{4}V_{AAA}$ + etc.; any estimate of $V_A$ from
half-sib correlations will contain also $\frac{1}{4}V_{AA} + \frac{1}{16}V_{AAA}$ + etc.; and any
estimate  of  VD  obtained  from  a  full-sib  correlation  will  contain  also 
portions  of  the  interaction  components.  We  noted  in  Chapter  7  that 
the  two  definitions  of  breeding  value  given  there  are  not  equivalent 
if  there  is  interaction  between  loci.  We  can  now  see  how  this  comes 
about.  Defined  in  terms  of  the  measured  values  of  progeny — the 
practical  definition — breeding  value  includes  additive  x  additive 
interaction  deviations  in  addition  to  the  average  effects  of  the  genes 
carried  by  the  parents;  whereas,  defined  in  terms  of  the  average 
effects  of  genes — the  theoretical  definition — it  does  not. 

Effect of linkage. Throughout the discussion of the covariances
of relatives we have ignored the effects of linkage, assuming always
that  the  loci  concerned  segregate  independently.  The  effects  of 
linkage  in  a  random-mating  population,  where  the  coupling  and 
repulsion  phases  are  in  equilibrium,  are  as  follows  (Cockerham, 
1956a).  The  covariances  of  offspring  and  parents  are  not  affected, 
but  the  covariances  of  half  and  full  sibs  are  increased;  the  closer  the 
linkage  the  greater  the  increase.  The  additional  covariance  due  to 
linkage  appears  with  the  interaction  component.  Therefore  what  is 
formally  attributed  to  epistatic  interaction  may  be  in  part  due  to 
linkage. 


Environmental  Covariance 


Genetic  causes  are  not  the  only  reasons  for  resemblance  between 
relatives;  there  are  also  environmental  circumstances  that  tend  to 
make  relatives  resemble  each  other,  some  sorts  of  relatives  more  than 
others.  If  members  of  a  family  are  reared  together,  as  with  human 
families  or  litters  of  pigs  or  mice,  they  share  a  common  environment. 
This  means  that  some  environmental  circumstances  that  cause 
differences  between  unrelated  individuals  are  not  a  cause  of  difference 
between  members  of  the  same  family.  In  other  words  there  is  a  com- 
ponent of  environmental  variance  that  contributes  to  the  variance 
between  means  of  families  but  not  to  the  variance  within  the  families, 
and  it  therefore  contributes  to  the  covariance  of  the  related  individuals. 
This  between-group  environmental  component,  for  which  we  shall 
use the symbol $V_{Ec}$, is usually called the common environment, a term
that  seems  more  appropriate  when  we  think  of  the  component  as  a 
cause  of  similarity  between  members  of  a  group  than  when  we  think 
of  it  as  a  cause  of  difference  between  members  of  different  groups. 
The  remainder  of  the  environmental  variance,  which  we  shall  denote 
by  VEw,  arises  from  causes  of  difference  that  are  unconnected  with 
whether  the  individuals  are  related  or  not.  It  therefore  appears  in 
the  within-group  component  of  variance,  but  does  not  contribute  to 
the  between-group  component,  which  is  the  variance  of  the  true 
means  of  the  groups.  In  considerations  of  the  resemblance  between 
relatives,  therefore,  the  environmental  variance  must  be  divided  into 
two  components: 


$$V_E = V_{Ec} + V_{Ew} \qquad (9.5)$$

one of the components, $V_{Ec}$, contributing to the covariance of the
related  individuals. 

The  sources  of  common  environmental  variance  are  many  and 
varied,  and  only  a  few  examples  can  be  mentioned.  Soil  conditions 
may  differentiate  families  of  plants  when  the  members  of  a  family  are 
grown  together  on  the  same  plot:  similarly  the  conditions  of  the  cul- 
ture medium  may  differentiate  families  of  Drosophila  or  other  small 
animals.  With  farm  animals,  related  individuals  are  likely  to  have 
been  reared  on  the  same  farm,  and  differences  of  climate  or  of  manage- 
ment contribute  to  the  resemblance  between  the  relatives.  "Maternal 
effects"  are  a  frequent  source  of  environmental  difference  between 
families,  especially  with  mammals.  The  young  are  subject  to  a 
maternal  environment  during  the  first  stages  of  their  life,  and  this 
influences  the  phenotypic  values  of  many  metric  characters  even 
when  measured  on  the  adult,  causing  offspring  of  the  same  mother  to 
resemble  each  other.  Finally,  members  of  the  same  family  tend  to  be 
contemporaneous,  and  changes  of  climatic  or  nutritional  conditions 
tend  to  differentiate  members  of  different  families.  This  source  of 
common  environmental  variation  is  especially  important  in  animals 
that  produce  their  young  in  broods  or  litters. 

These  various  sources  of  common  environmental  variation  con- 
tribute chiefly  to  the  resemblance  between  sibs,  though  some  may 
also  cause  resemblance  between  parent  and  offspring.  Maternal 
effects,  in  particular,  often  cause  a  resemblance  between  mother  and 
offspring  as  well  as  among  the  offspring  themselves.  Body  size  in  mice 
and  other  mammals  provides  an  example.  Large  mothers  tend  to 
provide  better  nutrition  for  their  young,  both  before  and  after  birth, 
than  small  mothers.  Therefore  the  young  of  large  mothers  tend  to 
grow  faster,  and  the  effect  of  the  rapid  early  growth  may  persist,  so 
that  when  adult  their  body  size  is  larger.  Thus  mothers  and  offspring 
tend  to  resemble  each  other  in  body  size. 

It  will  be  seen  from  the  examples  given  that  the  nature  of  the 
component  of  variance  due  to  common  environment  differs  according 
to  the  circumstances.  What  we  designate  as  the  VEc  component 
depends  on  the  way  in  which  individuals  are  grouped  when  we  esti- 
mate the  observational  components  of  phenotypic  variance.  What- 
ever the  form  of  the  analysis,  the  part  of  the  variance  between  the 
means  of  groups  that  can  be  ascribed  to  environmental  causes  is 
called  the  VEc  component.  The  nature  of  this  component  thus 
depends on the form of the analysis applied. If the groups in the
analysis are full-sib families then the VEc component represents
environmental  causes  of  similarity  between  full  sibs;  if  the  groups  are 
half  sibs  it  represents  causes  of  similarity  between  half  sibs.  And  in 
parent-offspring  relationships  a  comparable  covariance  term  repre- 
sents environmental  causes  of  resemblance  between  offspring  and 
parent.  Thus,  whenever  we  measure  a  phenotypic  covariance  with 
the  object  of  using  it  to  estimate  a  causal  component  of  variance  we 
have  to  decide  whether  it  includes  an  appreciable  component  due  to 
common  environment,  and  this  is  often  a  matter  of  judgment  based 
on  a  biological  understanding  of  the  organism  and  the  character.  In 
experiments,  much  of  the  VEc  component  can  often  be  eliminated  by 
suitable  design.  For  example,  members  of  the  same  family  need  not 
always  be  reared  in  the  same  vial,  cage,  or  plot;  they  can  be  random- 
ised over  the  rearing  environments.  Or,  by  replication,  the  VEc 
component  can  be  measured  and  suitable  allowance  made  for  it  in  the 
resemblance  between  the  relatives. 

Thus relatives of all sorts may in principle be subject to an
environmental source of resemblance. In what follows, however, we
shall make the simplification of disregarding the $V_{Ec}$ component for
all relatives except full sibs, though from time to time we shall put in
a reminder of its possible presence. Full sibs are subject to a common
maternal environment and this is often the most troublesome
source of environmental resemblance to overcome by experimental
design. Consequently a $V_{Ec}$ component contributes more often and
in greater amount to the covariance of full sibs than to that of any
other sort of relative. The simplification of disregarding all other
sources of common environmental variance is therefore not entirely
unrealistic.


Phenotypic  Resemblance 

The  covariance  of  phenotypic  values  is  the  sum  of  the  covariances 
arising  from  genetic  and  from  environmental  causes.  Thus  by 
putting  together  the  conclusions  of  the  two  preceding  sections  we 
arrive  at  the  phenotypic  covariances  given  in  Table  9.4.  (It  will  be 
remembered  that  some  possible  sources  of  environmental  covariance 
are  being  neglected,  particularly  in  offspring-parent  relationships 
involving  the  mother.)  In  all  these  relationships  except  that  of  full 
sibs  the  covariance  is  either  a  half  or  a  quarter  of  the  additive  genetic 
variance. By observing the phenotypic covariance of relatives we can
thus estimate the amount of additive variance in the population and
make the partition of the variance into additive versus the rest.

To arrive at the degree of resemblance expressed as a regression or
correlation coefficient we have to divide the covariance by the appropriate
variance. The resemblance between sibs is expressed as a
correlation and the covariance is divided by the total phenotypic
variance. The correlation between half sibs, for example, is therefore
$\tfrac{1}{4}V_A/V_P$. The resemblance between offspring and parent is expressed
as the regression of offspring on parent, and the covariance is therefore
divided by the variance of parents. In the case of single parents
this is again the phenotypic variance, and the regression of offspring
on one parent is thus $\tfrac{1}{2}V_A/V_P$. In a random-breeding population the
phenotypic variance of parents and offspring is the same, and then the
correlation between offspring and one parent is the same as the
regression. The case of mid-parent values, however, is a little different.
The covariance has to be divided by the variance of mid-parent values,
and this is half the phenotypic variance, for the following reason. Let
$X$ and $Y$ stand for the phenotypic values of male and female parents
respectively. Then $\sigma_X^2 = \sigma_Y^2 = V_P$. The mid-parent value is
$\tfrac{1}{2}X + \tfrac{1}{2}Y$, and the variance of mid-parent values, assuming $X$ and $Y$ to be
uncorrelated, is therefore
$\tfrac{1}{4}\sigma_X^2 + \tfrac{1}{4}\sigma_Y^2 = \tfrac{1}{2}\sigma_X^2 = \tfrac{1}{2}V_P$. Thus
the regression of offspring on mid-parent is
$\tfrac{1}{2}V_A / \tfrac{1}{2}V_P = V_A/V_P$. The
correlation between offspring and mid-parent values, however, is
$\tfrac{1}{2}V_A/(\sigma_{\bar{P}}\sigma_O)$, where $\sigma_{\bar{P}}$ and $\sigma_O$ are the square roots of the
phenotypic variances of mid-parents and offspring respectively, and this
is not the same as the regression of offspring on mid-parent.

Table 9.4
Phenotypic Resemblance between Relatives

Relatives                   Covariance                                        Regression (b) or correlation (t)

Offspring and one parent    $\tfrac{1}{2}V_A$                                 $b = \tfrac{1}{2}V_A/V_P$
Offspring and mid-parent    $\tfrac{1}{2}V_A$                                 $b = V_A/V_P$
Half sibs                   $\tfrac{1}{4}V_A$                                 $t = \tfrac{1}{4}V_A/V_P$
Full sibs                   $\tfrac{1}{2}V_A + \tfrac{1}{4}V_D + V_{Ec}$      $t = (\tfrac{1}{2}V_A + \tfrac{1}{4}V_D + V_{Ec})/V_P$

The regressions of offspring on parents and the correlations of
sibs are shown in Table 9.4. All except the full-sib correlation are
simple fractions of the ratio $V_A/V_P$. Thus the different degrees of
resemblance between different sorts of relatives become apparent.
For example, the regression of offspring on one parent is twice the
correlation between half sibs, and the correlation between full sibs is
twice the correlation between half sibs if there is no dominance and
no common environment.
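
The relationships of Table 9.4 can be put into a few lines of code. What follows is only a sketch: the function name `resemblance` and the numerical values of the components are invented for illustration, and the sources of environmental covariance neglected in the table are neglected here too.

```python
def resemblance(v_a, v_d, v_ec, v_p):
    """Covariances and degrees of resemblance as laid out in Table 9.4.

    v_a, v_d and v_ec are the additive, dominance and common-environment
    components; v_p is the total phenotypic variance.
    """
    cov = {
        "offspring and one parent": 0.5 * v_a,
        "offspring and mid-parent": 0.5 * v_a,
        "half sibs": 0.25 * v_a,
        "full sibs": 0.5 * v_a + 0.25 * v_d + v_ec,
    }
    # Divisor: the variance of single parents or of sibs is V_P;
    # the variance of mid-parent values is half V_P.
    divisor = {
        "offspring and one parent": v_p,
        "offspring and mid-parent": 0.5 * v_p,
        "half sibs": v_p,
        "full sibs": v_p,
    }
    return {rel: (cov[rel], cov[rel] / divisor[rel]) for rel in cov}


# Hypothetical components, chosen only to make the arithmetic easy.
table = resemblance(v_a=40.0, v_d=10.0, v_ec=5.0, v_p=100.0)
for relatives, (covariance, coefficient) in table.items():
    print(f"{relatives:26s} cov = {covariance:5.1f}   b or t = {coefficient:.3f}")
```

With these invented figures the regression on one parent is twice the half-sib correlation, while the full-sib correlation exceeds twice the half-sib correlation because dominance and common environment are both present.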

The difference between the full-sib covariance and twice the
half-sib covariance can, in principle, be used to estimate the dominance
variance, $V_D$, provided there is no variance due to common
environment, though some of the variance due to epistatic interaction
would be included, as may be seen from Table 9.3. In practice,
however, it is usually very difficult to be certain that there is no
variance due to common environment, and estimates of the dominance
variance obtained in this way are generally to be regarded as
upper limits rather than as precise estimates.
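
The arithmetic behind this estimate can be written out from the covariances of Table 9.4, epistatic interaction being left out of account for simplicity:

```latex
\begin{aligned}
cov_{(FS)} - 2\,cov_{(HS)}
  &= \left(\tfrac{1}{2}V_A + \tfrac{1}{4}V_D + V_{Ec}\right) - 2\left(\tfrac{1}{4}V_A\right) \\
  &= \tfrac{1}{4}V_D + V_{Ec}, \\
4\left[cov_{(FS)} - 2\,cov_{(HS)}\right]
  &= V_D + 4V_{Ec} \;\ge\; V_D .
\end{aligned}
```

Any common environmental variance left in the full-sib covariance thus enters the estimate four times over, which is why the figure obtained can safely be treated only as an upper limit.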

Table 9.5
The Resemblance between Relatives for some Characters in Man

                                       Correlation coefficient
Character            Reference     Parent-offspring     Full sib

Stature              (1)           .51                  .53
Span                 (1)           .45                  .54
Length of forearm    (1)           .42                  .48
Intelligence         (2)           .49                  .49
Birth weight         (3)           -                    .50

(1) Pearson and Lee (1903).
(2) Unweighted averages of several estimates, cited by Penrose (1949).
(3) Quoted from Robson (1955).


The chief use of measurements of the degree of resemblance
between relatives is to estimate the proportionate amount of additive
genetic variance, $V_A/V_P$, which is the heritability. The meaning of the
heritability and the methods of estimating it will be considered more
fully in the next chapter. To conclude this chapter we give in Table
9.5 some examples of correlations between relatives in man. These
are undoubtedly complicated by covariance due to common
environment, and also by assortative mating. The correlation between
husband and wife for intelligence, for example, is as high as 0.58
(see Penrose, 1949). For these reasons human correlations cannot
easily be used to partition the variation into its components.




CHAPTER    10 


HERITABILITY 

The heritability of a metric character is one of its most important
properties. It expresses, as we have seen, the proportion of the total
variance that is attributable to the average effects of genes, and this is
what determines the degree of resemblance between relatives. But
the most important function of the heritability in the genetic study
of metric characters has not yet been mentioned, namely its predictive
role, expressing the reliability of the phenotypic value as a guide to
the breeding value. Only the phenotypic values of individuals can
be directly measured, but it is the breeding value that determines their
influence on the next generation. Therefore if the breeder or
experimenter chooses individuals to be parents according to their
phenotypic values, his success in changing the characteristics of the
population can be predicted only from a knowledge of the degree of
correspondence between phenotypic values and breeding values. This
degree of correspondence is measured by the heritability, as the
following considerations will show.

The heritability is defined as the ratio of additive genetic variance
to phenotypic variance:

$$h^2 = \frac{V_A}{V_P} \tag{10.1}$$

(The customary symbol $h^2$ stands for the heritability itself and not for
its square. The symbol derives from Wright's (1921) terminology,
where $h$ stands for the corresponding ratio of standard deviations.)
An equivalent meaning of the heritability is the regression of breeding
value on phenotypic value:

$$h^2 = b_{AP} \tag{10.2}$$

The equivalence of these meanings can be seen from reasoning similar
to that by which we derived the genetic covariance of offspring and
one parent on p. 153. If we split the phenotypic value into breeding
value and a remainder ($R$) consisting of the environmental, dominance,
and interaction deviations, then $P = A + R$. Since $A$ and $R$ are
uncorrelated, $cov_{AP} = V_A$ and so $b_{AP} = V_A/V_P$.

We may note also that the correlation between breeding values
and phenotypic values, $r_{AP}$, is equal to the square root of the
heritability. This follows from the general relationship between
correlation and regression coefficients, which gives

$$r_{AP} = b_{AP}\,\frac{\sigma_P}{\sigma_A} = h \tag{10.3}$$
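
Both relationships, the regression of breeding value on phenotypic value and the correlation between them, can be checked by a small simulation. The sketch below rests on assumptions that are not part of the argument above: breeding values and remainders are drawn from normal distributions, the variances are invented, and the helper names `simulate` and `covariance` are ours.

```python
import math
import random


def simulate(v_a, v_r, n, seed=1):
    """Draw n individuals with P = A + R, A and R being uncorrelated."""
    rng = random.Random(seed)
    a = [rng.gauss(0.0, math.sqrt(v_a)) for _ in range(n)]
    r = [rng.gauss(0.0, math.sqrt(v_r)) for _ in range(n)]
    p = [ai + ri for ai, ri in zip(a, r)]
    return a, p


def covariance(x, y):
    """Sample covariance; covariance(x, x) is the sample variance."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (len(x) - 1)


# Hypothetical variances: V_A = 4 and V_R = 6, so that h2 = 0.4.
a, p = simulate(v_a=4.0, v_r=6.0, n=20000)
b_ap = covariance(a, p) / covariance(p, p)  # regression of A on P
r_ap = covariance(a, p) / math.sqrt(covariance(a, a) * covariance(p, p))

print(f"b_AP = {b_ap:.3f}   (h2 = 0.400)")
print(f"r_AP = {r_ap:.3f}   (h  = {math.sqrt(0.4):.3f})")
```

Apart from sampling error, the regression should come out close to the heritability and the correlation close to its square root.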

By regarding the heritability as the regression of breeding value
on phenotypic value we see that the best estimate of an individual's
breeding value is the product of its phenotypic value and the
heritability:

$$A_{(\text{expected})} = h^2 P \tag{10.4}$$

breeding values and phenotypic values both being reckoned as
deviations from the population mean. In other words the heritability
expresses the reliability of the phenotypic value as a guide to the
breeding value, or the degree of correspondence between phenotypic
value and breeding value. For this reason the heritability enters into
almost every formula connected with breeding methods, and many
practical decisions about procedure depend on its magnitude. These
matters, however, will be considered in the next chapters; here we
are concerned only to point out that the determination of the
heritability is one of the first objectives in the genetic study of a metric
character.
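
As a small illustration of equation 10.4, the following sketch computes expected breeding values from phenotypic values. The function name, the population mean, and the heritability used are all hypothetical; the only point taken from the text is that both quantities are reckoned as deviations from the population mean.

```python
def expected_breeding_value(phenotype, population_mean, h2):
    """Equation 10.4: the result is a deviation from the population mean."""
    return h2 * (phenotype - population_mean)


# Hypothetical figures: population mean 20 units, heritability 0.35.
for phenotype in (17.0, 20.0, 24.0):
    a_hat = expected_breeding_value(phenotype, population_mean=20.0, h2=0.35)
    print(f"P = {phenotype:4.1f}   expected A (as deviation) = {a_hat:+.2f}")
```

An individual four units above the mean is thus expected to have a breeding value only 1.4 units above it; the rest of its superiority is attributed, on average, to the remainder.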

It is important to realise that the heritability is a property not
only of a character but also of the population and of the environmental
circumstances to which the individuals are subjected. Since
the value of the heritability depends on the magnitude of all the
components of variance, a change in any one of these will affect it. All
the genetic components are influenced by gene frequencies and may
therefore differ from one population to another, according to the past
history of the population. In particular, small populations maintained
long enough for an appreciable amount of fixation to have taken place
are expected to show lower heritabilities than large populations.
The environmental variance is dependent on the conditions of culture
or management: more variable conditions reduce the heritability,
more uniform conditions increase it. So, whenever a value is stated
for the heritability of a given character it must be understood to refer
to a particular population under particular conditions. Values found
in other populations under other circumstances will be more or less
the same according to whether the structure of the population and the
environmental conditions are more or less alike.
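
The dependence of the heritability on the conditions of management is easily shown numerically. In this sketch the genetic components are held fixed and only the environmental variance is altered; all the figures, and the function name `heritability`, are invented for illustration.

```python
def heritability(v_a, v_non_additive, v_e):
    """h2 = V_A / V_P, with V_P taken as the sum of the components."""
    return v_a / (v_a + v_non_additive + v_e)


# Hypothetical population: V_A = 30 and non-additive genetic variance = 10
# under both sets of conditions; only V_E differs.
for label, v_e in (("uniform conditions", 20.0), ("variable conditions", 80.0)):
    print(f"{label:20s} h2 = {heritability(30.0, 10.0, v_e):.2f}")
```

The same population, with the same additive variance, would thus be credited with a heritability of 0.50 under the more uniform conditions and 0.25 under the more variable ones.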

Very many determinations of heritabilities have been made for a
variety of characters, chiefly in farm animals. Some representative
examples are given in Table 10.1. Different determinations of the
heritability of the same character show a considerable range of
variation. This is partly due to statistical sampling, but some of the
variation reflects real differences between the populations or the
conditions under which they are studied. For these reasons, and
because estimations of heritabilities can seldom be very precise, the
figures quoted in the table are rounded to the nearest 5 per cent.
From Table 10.1 it can be seen that the magnitude of the heritability
shows some connexion with the nature of the character. On the
whole, the characters with the lowest heritabilities are those most
closely connected with reproductive fitness, while the characters
with the highest heritabilities are those that might be judged on
biological grounds to be the least important as determinants of natural
fitness. This is well seen in the gradation of the four characters of
Drosophila.

Table 10.1

Approximate values of the heritability of various characters
in domestic and laboratory animals.

Cattle
  Amount of white spotting in Friesians (Briquet and Lush, 1947)            .95
  Butterfat % (Johansson, 1950)                                             .6
  Milk-yield (Johansson, 1950)                                              .3
  Conception rate (in 1st service) (A. Robertson, 1957a)                    .01

Pigs
  Thickness of back fat (Fredeen and Jonsson, 1957)                         .55
  Body length (Fredeen and Jonsson, 1957)                                   .5
  Weight at 180 days (Whatley, 1942)                                        .3
  Litter size (Lush and Molln, 1942)                                        .15

Sheep (Australian Merino)
  Length of wool (Morley, 1955)                                             .55
  Weight of fleece (Morley, 1955)                                           .4
  Body weight (Morley, 1955)                                                .35

Poultry (White Leghorn)
  Egg weight (Lerner and Cruden, 1951)                                      .6
  Age at laying of first egg (King and Henderson, 1954b)                    .5
  Egg-production (annual, of surviving birds) (King and Henderson, 1954b)   .3
  Egg-production (annual, of all birds) (King and Henderson, 1954b)         .2
  Body weight (Lerner and Cruden, 1951)                                     .2
  Viability (Robertson and Lerner, 1949)                                    .1

Rats
  Expression of hooded gene (amount of white)
    (from data of Castle and Wright, 1916)                                  .4
  Ovary response to gonadotrophic hormone (Chapman, 1946)                   .35
  Age at puberty in females (Warren and Bogart, 1952)                       .15

Mice
  Tail length at 6 weeks (Falconer, 1954b)                                  .6
  Body weight at 6 weeks (Falconer, 1953)                                   .35
  Litter size (1st litters) (Falconer, 1955)                                .15

Drosophila melanogaster
  Abdominal bristle number (Clayton, Morris, and Robertson, 1957)           .5
  Body size (thorax length) (F. W. Robertson, 1957b)                        .4
  Ovary size (F. W. Robertson, 1957a)                                       .3
  Egg production (F. W. Robertson, 1957b)                                   .2


Estimation  of  Heritability 

Let us first compare the merits of the different sorts of relatives
for estimating either the additive genetic variance from the covariance,
or the heritability from the regression or correlation coefficient.
Table 10.2 shows again the composition of the phenotypic covariances,
and shows also the regression or correlation expressed in terms of the
heritability.

Table 10.2

Relatives                   Covariance                                        Regression (b) or correlation (t)

Offspring and one parent    $\tfrac{1}{2}V_A$                                 $b = \tfrac{1}{2}h^2$
Offspring and mid-parent    $\tfrac{1}{2}V_A$                                 $b = h^2$
Half sibs                   $\tfrac{1}{4}V_A$                                 $t = \tfrac{1}{4}h^2$
Full sibs                   $\tfrac{1}{2}V_A + \tfrac{1}{4}V_D + V_{Ec}$      $t > \tfrac{1}{2}h^2$

The choice depends on the circumstances. In addition
to the practical matter of which sorts of relatives are in fact
obtainable, there are two points to consider: sampling error and
environmental sources of covariance. The statistical precision of the estimate
depends on the experimental design and also on the magnitude of the
heritability being estimated, and so no hard and fast rule can be
made. The matter of statistical precision will be further considered
in a later section of this chapter. The question of environmental
sources of covariance is generally more important than the statistical
precision of the estimate, because it may introduce a bias which
cannot be overcome by statistical procedure. From considerations of
the biology of the character and the experimental design we have to
decide which covariance is least likely to be augmented by an
environmental component, a matter already discussed in the last
chapter. Generally speaking the half-sib correlation and the
regression of offspring on father are the most reliable from this point of
view. The regression of offspring on mother is sometimes liable to
give too high an estimate on account of maternal effects, as it would,
for example, with body size in most mammals. The full-sib
correlation, which is the only relationship for which an environmental
component of covariance is shown in the table, is the least reliable of
all. The component due to common environment is often present in
large amount and is difficult to overcome by experimental design;
and the full-sib covariance is further augmented by the dominance
variance. The full-sib correlation can therefore seldom do more than
set an upper limit to the heritability.
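
The conversion of an observed regression or correlation into an estimate of the heritability follows directly from the last column of Table 10.2. The sketch below merely tabulates the multipliers; the names `MULTIPLIER` and `estimate_h2` and the observed coefficients are hypothetical, and no allowance is made for sampling error or for environmental covariance other than flagging the full-sib estimate as an upper limit.

```python
# Multiplier that converts an observed regression (b) or correlation (t)
# into an estimate of h2, following Table 10.2.
MULTIPLIER = {
    "offspring on one parent": 2.0,  # b = 1/2 h2
    "offspring on mid-parent": 1.0,  # b = h2
    "half sibs": 4.0,                # t = 1/4 h2
    "full sibs": 2.0,                # t > 1/2 h2: an upper limit only
}


def estimate_h2(relationship, coefficient):
    """Return (estimate, is_upper_limit) for an observed b or t."""
    estimate = MULTIPLIER[relationship] * coefficient
    return estimate, relationship == "full sibs"


# Hypothetical observed coefficients.
for relationship, value in (("offspring on one parent", 0.21),
                            ("half sibs", 0.10),
                            ("full sibs", 0.30)):
    h2, upper = estimate_h2(relationship, value)
    note = " (upper limit)" if upper else ""
    print(f"{relationship:24s} h2 = {h2:.2f}{note}")
```

Note that the half-sib correlation is multiplied by four, so that its sampling error is multiplied by four as well; this is one reason why the choice between relatives cannot be settled by the question of bias alone.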

Example 10.1. The heritability of abdominal bristle number in
Drosophila melanogaster has been determined by three different methods,
applied to the same population (Clayton, Morris, and Robertson, 1957),
with the following results:

Method of estimation             Heritability

Offspring-parent regression      .51 ± .07
Half-sib correlation             .48 ± .11
Full-sib correlation             .53 ± .07
Combined estimate                .52

The estimates obtained by the three methods are in very satisfactory
agreement. In this case, the character (bristle number) is free of
complications arising from maternal effects and common environment.

Let  us  now  consider  briefly  some  technical  matters  concerning  the 
translation  of  observational  data  into  estimates  of  heritability.  We 
shall  deal  first  with  the  estimation  of  the  heritability;  and  we  shall 
later  discuss  the  standard  error  of  the  estimate,  and  the  design  that 
gives  an  experiment  its  greatest  precision. 

Selection of parents and assortative mating. In the treatment
of resemblance between relatives we have supposed the parents to be
a random sample of their generation and to be mated at random. Quite
often, however, one or other of these conditions does not hold, and
the choice of which sort of relative to use in the estimation of
heritability is then somewhat restricted. In experimental and domesticated
populations the parents are often a selected group and consequently
the phenotypic variance among the parents is less than that of the
population as a whole and less than that of the offspring. The
regression of offspring on parents, however, is not affected by the selection
of parents because the covariance is reduced to the same extent as the
variance of the parents, so that the slope of the regression line is
unaltered. Thus the regression of offspring on one parent is a valid
measure of $\tfrac{1}{2}h^2$, and that of offspring on mid-parent is a valid measure
of $h^2$. But the covariance is not a valid measure of $V_A$, nor the
variance of parents of $V_P$; moreover, the correlation and regression
coefficients are not equal.

Sometimes the mating of parents is not made at random but
according to their phenotypic resemblance, a system known as
assortative mating. There is then a correlation between the
phenotypic values of the mated pairs. The consequences of assortative
mating are described by Reeve (1955b) but they are too complicated
to explain in detail here. They can be deduced by modification of
Table 9.2, the frequencies of the different types of mating being
altered according to the correlation between the mated pairs. The
variance of mid-parent values is increased and consequently also the
covariance of full sibs. The regression of offspring on mid-parent,
however, is very little affected and it can be taken as a valid measure
of $h^2$. The increased variance of mid-parent values under assortative
mating has the practical advantage of reducing the sampling error of
the regression coefficient and thus of the estimate of heritability.
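
The increase in the variance of mid-parent values can be made explicit. Writing $r$ for the phenotypic correlation between mated pairs (a symbol introduced here only for illustration), and taking both sexes to have the variance $V_P$ as before, the covariance of mates is $rV_P$ and the variance of the mid-parent value $\tfrac{1}{2}X + \tfrac{1}{2}Y$ becomes

```latex
\sigma^2_{\bar{P}}
  = \tfrac{1}{4}\sigma^2_X + \tfrac{1}{4}\sigma^2_Y + \tfrac{1}{2}\,cov_{XY}
  = \tfrac{1}{2}V_P\,(1 + r).
```

With random mating $r = 0$ and this reduces to $\tfrac{1}{2}V_P$. Since the sampling variance of a regression coefficient is inversely proportional to the sum of squares of the independent variate, a positive correlation between mates gives a more precise estimate from the same number of families.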

Offspring-parent  relationship.  The  estimation  of  heritability 
from the regression of offspring on parent is comparatively straightforward
and needs little comment apart from the points mentioned in
the  preceding  paragraphs.  The  data  are  obtained  in  the  form  of 
measurements  of  parents  and  the  mean  values  of  their  offspring.  The 
covariance  is  then  computed  in  the  usual  way  from  the  cross-products 
of  the  paired  values.  The  mean  values  of  offspring  may  be  weighted 
according  to  the  number  of  offspring  in  each  family,  if  the  numbers 
differ.  The  appropriate  weighting  is  discussed  by  Kempthorne  and 
Tandon  (1953)  and  by  Reeve  (1955c). 
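
The computation itself may be sketched as follows. The data are invented, the function name `regression_slope` is ours, and the offspring means are left unweighted; the weighting discussed in the references just cited is not attempted here.

```python
def regression_slope(parent_values, offspring_means):
    """Slope of offspring on parent from sums of squares and cross-products."""
    n = len(parent_values)
    mean_x = sum(parent_values) / n
    mean_y = sum(offspring_means) / n
    sum_xy = sum((x - mean_x) * (y - mean_y)
                 for x, y in zip(parent_values, offspring_means))
    sum_xx = sum((x - mean_x) ** 2 for x in parent_values)
    return sum_xy / sum_xx


# Hypothetical data: five families, mid-parent value and mean of offspring.
mid_parents = [9.0, 10.0, 11.0, 12.0, 13.0]
offspring = [10.2, 10.3, 11.1, 11.4, 12.0]

b = regression_slope(mid_parents, offspring)
print(f"b = {b:.2f}")
```

Because the independent variate here is the mid-parent value, the slope is itself the estimate of the heritability; had the values been those of single parents, the estimate would be twice the slope.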


Fig. 10.1. Regression of offspring on mid-parent for wing-length
in  Drosophila,  as  explained  in  Example  10.2.  Mid-parent  values  are 
shown  along  the  horizontal  axis,  and  mean  value  of  offspring  along 
the  vertical  axis.  (Drawn  from  data  kindly  supplied  by  Dr  E.  C. 
R.  Reeve.) 

Example 10.2. Fig. 10.1 illustrates the regression of offspring on
mid-parent values for wing length in Drosophila melanogaster (Reeve and
Robertson, 1953). There are 37 pairs of parents and a mean of 273
offspring were measured from each pair of parents. The parents were
mated assortatively, with the result that the variance of mid-parent values
is greater than it would be if mating had been at random. Each point on
the graph represents the mean value of one pair of parents (measured along
the horizontal axis), and the mean value of their offspring (measured along
the vertical axis). The axes are marked at intervals of 1/100 mm., and they
intersect at the mean value of all parents and all offspring. The sloping
line is the linear regression of offspring on mid-parent. The slope of this
line estimates the heritability, and has the value (± standard error):

$$h^2 = b_{O\bar{P}} = 0.577 \pm 0.07$$

A  complication  in  the  use  of  the  regression  of  offspring  on  mid- 
parent  arises  if  the  variance  is  not  equal  in  the  two  sexes.  We  noted 
in  the  previous  chapter  that  the  genetic  covariance  of  offspring  and 
mid-parent  is  equal  to  half  the  additive  variance  on  condition  that  the 
sexes  are  equal  in  variance.  If  this  is  not  so,  the  regression  on  mid- 
parent  cannot,  strictly  speaking,  be  used,  and  the  heritability  must 
be  estimated  separately  for  each  sex  from  the  regression  of  daughters 
on  mothers  and  of  sons  on  fathers.  If  the  heritabilities  are  found  to 
be  equal  in  the  two  sexes,  then  a  joint  estimate  can  be  made  from  the 
regression  on  mid-parent,  by  taking  the  mean  value  of  the  offspring 
as  the  unweighted  mean  of  males  and  females. 

Sib analysis. The estimation of heritability from half sibs is
more complicated than appears at first sight and needs more detailed
comment. A common form in which data are obtained with animals
is the following. A number of males (sires) are each mated to several
females (dams), and a number of offspring from each female are
measured to provide the data. The individuals measured thus form a
population of half-sib and full-sib families. An analysis of variance
is then made by which the phenotypic variance is divided into
observational components attributable to differences between the
progeny of different males (the between-sire component, $\sigma_S^2$); to
differences between the progeny of females mated to the same male
(between-dam, within-sires, component, $\sigma_D^2$); and to differences
between individual offspring of the same female (within-progenies
component, $\sigma_W^2$). The form of the analysis is shown in Table 10.3.
There are supposed to be $s$ sires, each mated to $d$ dams, which
produce $k$ offspring each. The values of the mean squares are
denoted by $MS_S$, $MS_D$, and $MS_W$. The mean square within progenies
is itself the estimate of the within-progeny variance component,
$\sigma_W^2$; but the other mean squares are not the variance components.
The compositions of the mean squares in terms of the observational
components of variance are shown in the right-hand column of the
table, consideration of which will show how the variance components
are to be estimated. The between-dam mean square, for example, is
made up of the within-progeny component together with $k$ times the
between-dam component; so the between-dam component is
estimated as $\sigma_D^2 = (1/k)(MS_D - MS_W)$, i.e. we deduct the mean square for
progenies from the mean square for dams and divide by the number
of offspring per dam. Similarly the between-sire component is
estimated as $\sigma_S^2 = (1/dk)(MS_S - MS_D)$, where $dk$ is the number of
offspring per sire. If there are unequal numbers of offspring from the
dams, or of dams in the sire groups, the exact solution, which is
described by King and Henderson (1954a), Williams (1954), and
Snedecor (1956, section 10.17) becomes too complicated for
description here. We can, however, use the mean values of $d$ and $k$ with
little error, provided the inequality of numbers is not very great.

Table 10.3
Form of Analysis of Half-Sib and Full-Sib Families

Source                         d.f.          Mean Square    Composition of Mean Square

Between sires                  $s - 1$       $MS_S$         $= \sigma_W^2 + k\sigma_D^2 + dk\sigma_S^2$
Between dams (within sires)    $s(d - 1)$    $MS_D$         $= \sigma_W^2 + k\sigma_D^2$
Within progenies               $sd(k - 1)$   $MS_W$         $= \sigma_W^2$

$s$ = number of sires
$d$ = number of dams per sire
$k$ = number of offspring per dam
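
The step from mean squares to observational components can be sketched in code. The function name `variance_components` and the mean squares supplied to it are hypothetical; the design is taken to be balanced, or nearly enough so for the mean values of $d$ and $k$ to serve.

```python
def variance_components(ms_s, ms_d, ms_w, d, k):
    """Observational components from the mean squares of a sib analysis.

    d = dams per sire, k = offspring per dam (mean values may be used
    when the numbers are only slightly unequal).
    """
    var_w = ms_w                     # MS_W estimates the within-progeny component
    var_d = (ms_d - ms_w) / k        # (1/k)(MS_D - MS_W)
    var_s = (ms_s - ms_d) / (d * k)  # (1/dk)(MS_S - MS_D)
    return var_s, var_d, var_w


# Hypothetical mean squares from a design with 3 dams per sire and
# 5 offspring per dam.
var_s, var_d, var_w = variance_components(
    ms_s=50.0, ms_d=20.0, ms_w=10.0, d=3, k=5)
print(var_s, var_d, var_w)
```

Each component is obtained by removing from a mean square the mean square of the level below it and dividing by the number of individuals over which that level's means are taken.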

The next step is to deduce the connexions between the
observational components that have been estimated from the data and the
causal components, in particular the additive genetic variance, the
estimation of which is the main purpose of the analysis. Though all
the information needed has already been given, the interpretation of
the observational components, which is given in Table 10.4, is not
immediately apparent without explanation. The first point to note
is that the estimate of the phenotypic variance is given by the sum
($\sigma_T^2$) of the three observational components:
$V_P = \sigma_T^2 = \sigma_S^2 + \sigma_D^2 + \sigma_W^2$.
This is not necessarily equal to the observed variance as estimated
from the total sum of squares, though the two seldom differ by much.
Now consider the interpretation of the between-sire component,
$\sigma_S^2$. This is the variance between the means of half-sib families and
it therefore estimates the phenotypic covariance of half sibs, $cov_{(HS)}$,
which is $\tfrac{1}{4}V_A$. Thus $\sigma_S^2 = \tfrac{1}{4}V_A$. Next consider the within-progeny
component, $\sigma_W^2$. Since any between-group variance component is
equal to the covariance of the members of the groups, it follows that a
within-group component is equal to the total variance minus the
covariance of members of the groups. The progenies of the dams are
full-sib families and so the within-progeny variance estimates
$V_P - cov_{(FS)}$. This leads to the interpretation
$\sigma_W^2 = \tfrac{1}{2}V_A + \tfrac{3}{4}V_D + V_{Ew}$.
Finally, there remains the between-dam component, and what it
estimates can be found by subtraction as follows:

$$\sigma_D^2 = \sigma_T^2 - \sigma_S^2 - \sigma_W^2 = cov_{(FS)} - cov_{(HS)} = \tfrac{1}{4}V_A + \tfrac{1}{4}V_D + V_{Ec}$$

Consideration of the between-sire and between-dam components will
show that their sum gives an estimate of the full-sib covariance,
$cov_{(FS)}$, but this provides no new information for estimating the causal
components. These conclusions about the connexion between
observational and causal components of variance are summarised in
Table 10.4. The contributions of the interaction variance to the
observational components are given by Kempthorne (1955a), and
can be deduced from the contributions to the covariances given in
Table 9.3.

Table 10.4

Interpretation of the observational components of variance
in a sib analysis

Observational component                                            Covariance and causal components estimated

Sires:         $\sigma_S^2$                                        $= cov_{(HS)} = \tfrac{1}{4}V_A$
Dams:          $\sigma_D^2$                                        $= cov_{(FS)} - cov_{(HS)} = \tfrac{1}{4}V_A + \tfrac{1}{4}V_D + V_{Ec}$
Progenies:     $\sigma_W^2$                                        $= V_P - cov_{(FS)} = \tfrac{1}{2}V_A + \tfrac{3}{4}V_D + V_{Ew}$
Total:         $\sigma_T^2 = \sigma_S^2 + \sigma_D^2 + \sigma_W^2$ $= V_P = V_A + V_D + V_{Ec} + V_{Ew}$
Sires + Dams:  $\sigma_S^2 + \sigma_D^2$                           $= cov_{(FS)} = \tfrac{1}{2}V_A + \tfrac{1}{4}V_D + V_{Ec}$

Example  10.3.  As  an  illustration  of  the  estimation  of  heritability  from 
a  sib  analysis  we  refer  to  the  study  of  Danish  Landrace  pigs  based  on  the 
records  of  the  Danish  Pig  Progeny  Testing  Stations  (Fredeen  and  Jonsson, 
1957).  The  data  came  from  468  sires  each  mated  to  2  dams,  the  analysis 
being  made  on  the  records  of  2  male  and  2  female  offspring  from  each 
dam.  Only  one  such  analysis  is  given  here:  that  of  body  length  in  the  male 
offspring.  The  analysis,  shown  in  the  table,  was  made  within  stations  and 
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within  years,  and  this  accounts  for  the  degrees  of  freedom  being  fewer  than 
would  appear  appropriate  from  the  numbers  stated  above.  The  interpre- 
tation of  the  analysis,  shown  at  the  foot  of  the  table,  has  been  slightly 

Sib analysis of body length in Danish Landrace pigs; data
for male offspring only (from Fredeen and Jonsson, 1957).

Source                        d.f.   Mean Square   Component of variance

Between sires                  432      6.03       $\sigma^2_S = \tfrac{1}{4}(6.03 - 3.81) = 0.555$
Between dams, within sires     468      3.81       $\sigma^2_D = \tfrac{1}{2}(3.81 - 2.87) = 0.47$
Within progenies               936      2.87       $\sigma^2_W = 2.87$
                                                   $\sigma^2_T = 3.895$

Interpretation of analysis

Sib correlations
  Half sibs:       $t_{(HS)} = \sigma^2_S / \sigma^2_T = 0.142$
  Full sibs:       $t_{(FS)} = (\sigma^2_S + \sigma^2_D) / \sigma^2_T = 0.263$

Estimates of heritability
  Sire-component:  $h^2 = 4\sigma^2_S / \sigma^2_T = 0.57$
  Dam-component:   $h^2 = 4\sigma^2_D / \sigma^2_T = 0.48$
  Sire + Dam:      $h^2 = 2(\sigma^2_S + \sigma^2_D) / \sigma^2_T = 0.53$


simplified by the omission of some minor adjustments not relevant for us
at this stage. The between-dam component is not greater than the
between-sire component, so there cannot be much non-additive genetic
variance or variance due to common environment. The two estimates of the
heritability, from the sire and dam components respectively, can therefore
be regarded as equally reliable, and their combination based on the
resemblance between full sibs may be taken as the best estimate.
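
The arithmetic of Example 10.3 can be retraced in a few lines. The
sketch below is an illustration only, assuming a balanced design with 2
dams per sire and 2 measured offspring per dam, as stated in the
example; the function name `sib_analysis` and its argument names are
hypothetical and are not part of the original analysis.

```python
def sib_analysis(ms_sire, ms_dam, ms_within, dams_per_sire, offspring_per_dam):
    """Variance components and heritability estimates, balanced sib analysis.

    Assumed expected mean squares (balanced nested design):
      within progenies : var_w
      between dams     : var_w + k * var_d
      between sires    : var_w + k * var_d + d * k * var_s
    """
    k = offspring_per_dam
    d = dams_per_sire
    var_w = ms_within
    var_d = (ms_dam - ms_within) / k
    var_s = (ms_sire - ms_dam) / (d * k)
    var_t = var_s + var_d + var_w
    return {
        "var_s": var_s,
        "var_d": var_d,
        "var_w": var_w,
        "var_t": var_t,
        "t_hs": var_s / var_t,                    # half-sib correlation
        "t_fs": (var_s + var_d) / var_t,          # full-sib correlation
        "h2_sire": 4 * var_s / var_t,             # from sire component
        "h2_dam": 4 * var_d / var_t,              # from dam component
        "h2_both": 2 * (var_s + var_d) / var_t,   # from sire + dam
    }


# Mean squares taken from the table of Example 10.3.
result = sib_analysis(6.03, 3.81, 2.87, dams_per_sire=2, offspring_per_dam=2)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

The dam-component estimate is valid as an estimate of heritability only
when dominance and common-environment variance are negligible, which is
the conclusion drawn in the example.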

Example  10.4.  We  have  not  yet  had  an  example  to  illustrate  the  effect 
of  common  environment  in  augmenting  the  full-sib  correlation.  This  is 
provided  by  body  size  in  mice.   The  analysis  given  in  table  (i)  refers  to  the 


Table (i)

Source      d.f.   Mean Square   Composition of M.S.                            Components

Sires        70      17.10       $\sigma^2_W + k'\sigma^2_D + dk'\sigma^2_S$    $\sigma^2_S = 0.48$
Dams        118      10.79       $\sigma^2_W + k\sigma^2_D$                     $\sigma^2_D = 2.47$
Progenies   527       2.19       $\sigma^2_W$                                   $\sigma^2_W = 2.19$

$k = 3.48$; $k' = 4.16$; $d = 2.33$                                             $\sigma^2_T = 5.14$

weight of female mice at 6 weeks of age (J. C. Bowman, unpublished).
There were 719 offspring from 74 sires and 192 dams, each with one
litter. These were spread over 4 generations and the analysis was made
within generations. The analysis is complicated by the inequality of the
number of offspring per dam and of dams per sire. We shall not attempt
to explain the adjustments made for these inequalities, but simply give
the compositions of the mean squares from which the components are
estimated. The dam component is much greater than the sire component,
indicating a substantial amount of variance due to common environment.
Therefore only the sire component can be used to estimate the heritability.
The estimate obtained is $h^2 = 4 \times 0.48/5.14 = 0.37$. Let us now use
the analysis to estimate the causal components according to the
interpretation given in Table 10.4, but with the assumption that
non-additive genetic variance is negligible in amount. Table (ii) gives
the estimates and shows how they


Table (ii)

$V_P    = \sigma^2_T$                  = 5.14 = 100%
$V_A    = 4\sigma^2_S$                 = 1.92 =  37%
$V_{Ec} = \sigma^2_D - \sigma^2_S$     = 1.99 =  39%
$V_{Ew} = \sigma^2_W - 2\sigma^2_S$    = 1.23 =  24%

are derived. The percentage contribution of each component to the total
variance is given in the right-hand column. It will be seen that the
variance due to common environment ($V_{Ec}$) amounts to 39 per cent of
the total, and is greater than the environmental variance within full-sib
families ($V_{Ew}$) which amounts to only 24 per cent of the total.
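
The two steps of Example 10.4 (mean squares to observational components,
then observational to causal components) may be sketched as follows.
This is a hedged illustration: the function names are hypothetical, the
coefficients k, k' and d are simply taken from table (i) without
reproducing the adjustments for unequal family sizes, and the second
function is valid only under the stated assumption that non-additive
genetic variance is negligible.

```python
def components_from_mean_squares(ms_sire, ms_dam, ms_within, k, k_prime, d):
    """Solve the mean-square compositions of table (i) for the components."""
    var_w = ms_within
    var_d = (ms_dam - var_w) / k
    var_s = (ms_sire - var_w - k_prime * var_d) / (d * k_prime)
    return var_s, var_d, var_w


def causal_components(var_s, var_d, var_w):
    """Causal components, assuming negligible non-additive genetic variance."""
    return {
        "V_P": var_s + var_d + var_w,   # total phenotypic variance
        "V_A": 4 * var_s,               # additive variance
        "V_Ec": var_d - var_s,          # common-environment variance
        "V_Ew": var_w - 2 * var_s,      # within-family environmental variance
    }


# Components are rounded to two decimals, as they are in the example.
var_s, var_d, var_w = (
    round(x, 2)
    for x in components_from_mean_squares(
        17.10, 10.79, 2.19, k=3.48, k_prime=4.16, d=2.33
    )
)
table = causal_components(var_s, var_d, var_w)
for name, value in table.items():
    print(f"{name} = {value:.2f} ({100 * value / table['V_P']:.0f}%)")
```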


Intra-sire regression of offspring on dam. The heritability
can be estimated from the offspring-parent relationship in a
population with the structure described in the foregoing section, but a
slight modification is necessary. Since each male is mated to several
females, the regression of offspring on mid-parent is inappropriate; and,
since there are usually rather few male parents, the simple regressions
on one or other parent are both unsuitable. The heritability can,
however, be satisfactorily estimated from the average regression of
offspring on dams, calculated within sire groups. That is to say, the
regression of offspring on dam is calculated separately for each set of
dams mated to one sire, and the regressions from each set pooled in a
weighted average. This method is commonly used for the estimation
of heritabilities in farm animals. The intra-sire regression of
offspring on dam estimates half the heritability, as the following
consideration will show. The progeny of one sire has a mean deviation
from the population mean equal to half the breeding value of the sire,
provided  the  females  he  is  mated  to  are  a  random  sample  from  the 
population.  The  progeny  of  one  dam  deviates  from  the  mean  of  the 
sire  group  by  half  the  breeding  value  of  the  dam.  Therefore  the 
within-sire  covariance  of  offspring  and  dam  is  equal  to  half  the 
additive  variance  of  the  population  as  a  whole;  and  the  within-sire 
regression  of  offspring  on  dam  is  equal  to  half  the  heritability,  just 
like  the  simple  regression  of  offspring  on  one  parent.  The  validity 
of  the  estimate  is,  of  course,  dependent  on  the  absence  of  maternal 
effects  contributing  to  the  resemblance  between  daughters  and  dams. 
Inequality  of  the  variance  of  males  and  females  calls  for  an  adjustment 
if  the  heritability  is  to  be  estimated  from  the  intra-sire  regression  of 
male  offspring  on  dams.  The  regression  coefficient  should  then  be 
multiplied  by  the  ratio  of  the  phenotypic  standard  deviation  of  females 
to  that  of  males. 
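
One way of forming the pooled within-sire regression is sketched below.
It is an assumption of this sketch, not a statement of the text, that
the weighting is by the within-sire sum of squares of dam values, which
is what results from dividing the pooled cross-products by the pooled
sums of squares. The function name and the small data set are
hypothetical and chosen only so that the answer can be checked by hand.

```python
def intra_sire_regression(groups):
    """Pooled within-sire regression of offspring on dam.

    groups maps a sire label to a list of (dam_value, offspring_value)
    pairs. Deviations are taken from each sire group's own means, so
    differences between sires do not contribute to the regression.
    """
    sum_xy = 0.0
    sum_xx = 0.0
    for pairs in groups.values():
        n = len(pairs)
        mean_x = sum(x for x, _ in pairs) / n
        mean_y = sum(y for _, y in pairs) / n
        sum_xy += sum((x - mean_x) * (y - mean_y) for x, y in pairs)
        sum_xx += sum((x - mean_x) ** 2 for x, _ in pairs)
    return sum_xy / sum_xx


# Hypothetical data: within each sire group the offspring value rises by
# 0.25 per unit of dam value, but the two groups have different levels.
example = {
    "sire_A": [(10.0, 5.5), (12.0, 6.0), (14.0, 6.5)],
    "sire_B": [(9.0, 7.25), (11.0, 7.75), (15.0, 8.75)],
}
b = intra_sire_regression(example)
h2 = 2 * b  # the intra-sire regression estimates half the heritability
print(f"b = {b:.3f}, h2 = {h2:.3f}")
```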

Example  10.5.  The  heritability  of  abdominal  bristle-number  in 
Drosophila  melanogaster,  estimated  from  the  offspring-parent  regression, 
was  cited  in  Example  10.1.  This  was  in  fact  a  joint  estimate  based  on 
intra-sire  regressions  of  daughters  on  dams  and  of  sons  on  dams,  the  latter 
being  corrected  for  inequality  of  variance  in  the  two  sexes  (Clayton,  Morris, 
and Robertson, 1957). The separate regression coefficients, with the
correction for inequality of variances, and the estimates of the
heritability are given in the table.

                                                          Estimate of
                                                          heritability

Standard deviation: females                      3.54
Standard deviation: males                        3.03
Standard deviation: female/male                  1.17
Regression coefficient: daughter-dam             0.269       0.54
Regression coefficient: son-dam                  0.206
Regression coefficient: son-dam corrected
                              0.206 × 1.17 =     0.241       0.48
Joint estimate, as given in Example 10.1,                    0.51
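
The correction used in Example 10.5 amounts to one multiplication and a
doubling. A minimal sketch follows; the function name is hypothetical,
and the unrounded ratio of standard deviations is used, so the corrected
coefficient differs very slightly from the 0.241 of the table.

```python
def heritability_from_dam_regression(b, sd_dams=None, sd_offspring=None):
    """Heritability from an intra-sire regression of offspring on dam.

    If the offspring are of the other sex and the variances differ, the
    coefficient is first multiplied by the ratio of the phenotypic
    standard deviation of the dams' sex to that of the offspring's sex.
    """
    if sd_dams is not None and sd_offspring is not None:
        b = b * sd_dams / sd_offspring
    return 2 * b


h2_daughters = heritability_from_dam_regression(0.269)
h2_sons = heritability_from_dam_regression(0.206, sd_dams=3.54, sd_offspring=3.03)
print(f"daughters on dams: {h2_daughters:.2f}")
print(f"sons on dams:      {h2_sons:.2f}")
```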


The  Precision  of  Estimates  of  Heritability 


It is of the greatest importance to know the precision of any
estimate of heritability. When an estimate has been obtained one wants
to be able to indicate its precision by the standard error. And when
an experiment aimed at estimating a heritability is being planned one
wants to choose the method and design the experiment so that the
estimate will have the greatest possible precision within the
limitations imposed by the scale of the experiment. The precision of an
estimate depends on its sampling variance, the lower the sampling
variance the greater the precision; and the standard error is the square
root of the sampling variance. Estimates of heritability are derived
from estimates of either a regression coefficient or an intra-class
correlation coefficient, and the sampling variances of these are given in
textbooks of statistics. We shall therefore present the necessary
formulae without explanation of their derivation. The information on
the design of experiments given here is derived from the paper by A.
Robertson (1959a) on this subject.

The  problems  of  experimental  design  are,  first,  the  choice  of 
method  and,  second,  the  decision  of  how  many  individuals  in  each 
family  are  to  be  measured.  Since  the  total  number  of  individuals 
measured  cannot  be  increased  indefinitely,  an  increase  of  the  number 
of  individuals  per  family  necessarily  entails  a  reduction  of  the  number 
of  families.  The  problem  is  therefore  to  find  the  best  compromise 
between  large  families  and  many  families.  In  assessing  the  relative 
efficiencies  of  different  methods  and  designs  we  have  to  compare 
experiments  made  on  the  same  scale;  that  is  to  say,  with  the  same 
total  expenditure  in  labour  or  cost.  We  must  therefore  decide  first 
what  are  the  circumstances  that  limit  the  scale  of  the  experiment.  If 
the  labour  of  measurement  is  the  limiting  factor,  as  for  example  in 
experiments  with  Drosophila,  then  the  limitation  is  in  the  total 
number  of  individuals  measured,  including  the  parents  if  they  are 
measured.  If,  on  the  other  hand,  breeding  and  rearing  space  is  the 
limiting  factor,  as  it  generally  is  with  larger  animals,  the  limitation 
may  be  either  in  the  number  of  families  or  in  the  total  number  of 
offspring  that  can  be  produced  for  measurement,  and  measurements 
of  the  parents  may  be  included  without  additional  cost.  We  cannot 
here  take  account  of  all  the  possible  ways  in  which  the  scale  of  the 
experiment  may  be  limited.  Therefore  for  the  sake  of  illustration  we 
shall  consider  only  a  limitation  of  the  total  number  of  individuals 
measured. That is to say, we shall assume the total number of
individuals measured to be the same for all methods and all
experimental designs. What we have to do, then, is to consider each method
on  this  basis  and  see  what  design  and  which  method  will  give  an 
estimate  of  the  heritability  with  the  lowest  sampling  variance. 




Offspring-parent regression. Consider first estimates based on
the regression of offspring on parents. Let X be the independent
variate, which may be either the value of a single parent or the
mid-parent value. Let Y be the dependent variate, which may be either a
single offspring of each parent or the mean of n offspring. Let
$\sigma^2_X$ and $\sigma^2_Y$ be the variances of X and Y respectively;
let b be the regression of Y on X, and N the number of paired
observations of X and Y, which is equivalent to the number of families
in the experiment. Let T be the total number of individuals measured,
which is fixed by the scale of the experiment. The number of offspring
measured is nN, and the number of parents N or 2N according to whether
the regression is on one parent or on the mid-parent value. So, with one
parent measured, T = N(n + 1), and with both parents measured
T = N(n + 2). With these symbols, the variance of the estimate of the
regression coefficient is


$$\sigma^2_b = \frac{1}{N-2}\left[\frac{\sigma^2_Y}{\sigma^2_X} - b^2\right] \qquad (10.5)$$

For use as a guide to design this formula is more convenient
if put in a simplified and approximate form. The regression
coefficient is usually small enough that $b^2$ can be ignored; and we may
suppose that N is fairly large, so that the variance of the estimate may
be put, approximately, as

$$\sigma^2_b = \frac{1}{N}\,\frac{\sigma^2_Y}{\sigma^2_X} \quad \text{(approx.)} \qquad (10.6)$$


When only one parent is measured the variance of parental values is
equal to the phenotypic variance, i.e. $\sigma^2_X = V_P$. When both
parents are measured (provided they were not mated assortatively) the
variance of mid-parent values is half the phenotypic variance, i.e.
$\sigma^2_X = \tfrac{1}{2}V_P$. The variance of the offspring values,
$\sigma^2_Y$, is the variance of the means of families of n individuals.
This depends on the phenotypic correlation, t, between members of
families, in a manner that will be explained in Chapter 13 (see Table
13.2), where it will be shown that

$$\sigma^2_Y = \frac{1 + (n-1)t}{n}\,V_P$$

Therefore by substitution for $\sigma^2_X$ and $\sigma^2_Y$ in equation
10.6 the sampling variance of the regression on one parent becomes

$$\sigma^2_b = \frac{1 + (n-1)t}{nN} \quad \text{(approx.)} \qquad (10.7)$$

and  that  of  the  regression  on  mid-parent  is  twice  as  great.  Since  the 
phenotypic  correlation,  t,  depends  on  the  heritability  it  will  not 
generally  be  known  at  the  time  an  experiment  is  being  planned. 
Therefore  the  best  design  cannot  be  exactly  determined  in  advance. 
We  can,  however,  get  an  approximate  idea  of  how  many  offspring  of 
each parent should be measured. On the assumption already stated,
that the total number of individuals measured including the parents
is fixed, it can be shown that the sampling variance given in equation
10.7 is minimal when $n = \sqrt{(1-t)/t}$ if one parent is measured and
when $n = \sqrt{2(1-t)/t}$ if both parents are measured. Consider, for
example, a character with a heritability of 20 per cent and no variance
due to common environment, so that the phenotypic correlation in full-sib
families is t = 0.1. Then the optimal family size works out to be
n = 3 when only one parent is measured and n = 4 when both parents
are measured. If we had taken a higher heritability the optimal family
size would have been lower. Large families are advantageous only
for the estimation of very low heritabilities. For example, full-sib
families of about 10 or 14 would be optimal for estimating a
heritability of 2 per cent.
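
The optimal family sizes quoted above can be checked numerically. The
sketch below assumes, as the text does, that the total number of
individuals measured (parents included) is fixed, and it takes the
variance of the regression on mid-parent to be twice that of equation
10.7. The function names are hypothetical.

```python
import math


def var_b(n, t, total, parents_measured):
    """Approximate sampling variance of the regression coefficient.

    parents_measured is 1 (regression on one parent) or 2 (regression on
    mid-parent). The number of families is N = T / (n + parents_measured),
    and the mid-parent regression has twice the variance of equation 10.7.
    """
    families = total / (n + parents_measured)
    return parents_measured * (1 + (n - 1) * t) / (n * families)


def optimal_family_size(t, parents_measured):
    """Family size minimising var_b when the total T is fixed."""
    return math.sqrt(parents_measured * (1 - t) / t)


for t in (0.1, 0.01):
    for parents in (1, 2):
        n_opt = optimal_family_size(t, parents)
        print(f"t = {t}, parents measured = {parents}: n = {n_opt:.2f}")

# Brute-force check over whole-number family sizes, for t = 0.1.
best_one = min(range(1, 50), key=lambda n: var_b(n, 0.1, 1000, 1))
best_mid = min(range(1, 50), key=lambda n: var_b(n, 0.1, 1000, 2))
print(best_one, best_mid)
```

The formula gives fractional family sizes; in practice the nearest whole
number is used, which is what the brute-force check confirms for t = 0.1.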

So  far  we  have  considered  only  the  sampling  variance  of  the 
regression  coefficient,  and  how  this  can  be  reduced  by  the  design  of 
the  experiment.  Now  let  us  consider  the  sampling  variance  of  the 
estimate  of  heritability,  so  that  we  can  compare  methods,  i.e.  the  use 
of  one  parent  or  of  mid-parent  values.  A  just  comparison  can  only 
be  made  on  the  assumption  of  the  optimal  design  for  each  method, 
and  therefore  we  can  only  illustrate  the  comparison  by  reference  to  a 
particular  case.  We  shall  consider  the  particular  case  mentioned 
above where the phenotypic correlation is t = 0.1, which would be
found in full-sib families when the heritability is 20 per cent. The
optimal family sizes are 3 or 4 as stated above. For the purpose of
comparison we have to express the sampling variance of the regression
coefficient given in equation 10.7 in terms of the total number of
individuals measured, T, since this is assumed to be the same for all
methods. We therefore substitute in equation 10.7 as follows. When
one parent is measured N = T/(n + 1), and n = 3. When both parents are
measured N = T/(n + 2), and n = 4. Substitution in equation 10.7 then
yields $\sigma^2_b = 4.8/3T$ when one parent is measured, and
$\sigma^2_b = 3.9/T$ when both are measured, the latter including the
factor of two that applies to the regression on mid-parent. The
regression on one parent must be doubled to give the estimate of
heritability, but the regression on mid-parent is itself the estimate.
So the sampling variances of the estimates of heritability, in the
special case under consideration, are:

By regression on one parent:  $\sigma^2_{h^2} = 4\sigma^2_b = 6.4/T$  (approx.)
By regression on mid-parent:  $\sigma^2_{h^2} = \sigma^2_b = 3.9/T$  (approx.)


Thus  the  estimate  based  on  mid-parent  values  has  considerably  less 
sampling  variance.  A  regression  on  mid-parent  values,  in  general, 
yields  a  more  precise  estimate  of  heritability  for  a  given  total  number 
of  individuals  measured. 
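
The two sampling variances just compared can be reproduced as follows.
This is a sketch for the special case discussed (t = 0.1, family sizes 3
and 4); the function names are hypothetical, and the total T is set to
1 so that the results read directly as multiples of 1/T.

```python
def var_h2_one_parent(n, t, total):
    """Sampling variance of h2 estimated as twice the regression on one parent."""
    families = total / (n + 1)
    var_b = (1 + (n - 1) * t) / (n * families)   # equation 10.7
    return 4 * var_b                             # h2 = 2b, so variance is 4 times


def var_h2_mid_parent(n, t, total):
    """Sampling variance of h2 estimated as the regression on mid-parent."""
    families = total / (n + 2)
    var_b = 2 * (1 + (n - 1) * t) / (n * families)   # twice equation 10.7
    return var_b                                     # h2 = b


one_parent = var_h2_one_parent(n=3, t=0.1, total=1.0)
mid_parent = var_h2_mid_parent(n=4, t=0.1, total=1.0)
print(f"one parent: {one_parent:.1f}/T")
print(f"mid-parent: {mid_parent:.1f}/T")
```

With T = 1000 individuals, for instance, these variances correspond to
standard errors of about 0.08 and 0.06 respectively.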

Sib analyses. Now let us consider estimates obtained from the
intra-class correlation of full-sib or half-sib families. We shall at
first suppose for simplicity that half-sib families are not subdivided
into full-sib families; i.e. that only one offspring from each dam is
measured in paternal half-sib families. In the case of full-sib families
we shall assume that there is no variance due to common environment
so that the estimate of heritability is a valid one. Let N be the
number of families, and n the number of individuals per family, so
that the total number of individuals measured is T = nN. Let the
intra-class correlation be t. The sampling variance of the intra-class
correlation is then

$$\sigma^2_t = \frac{2[1 + (n-1)t]^2 (1-t)^2}{n(n-1)(N-1)} \qquad (10.8)$$

When the value of T = nN is limited by the size of the experiment it
can be shown that the sampling variance of the intra-class correlation
is minimal when n = 1/t, approximately. Therefore the optimal family
size depends on the heritability. In the case of full-sib families
$h^2 = 2t$, and in the case of half-sib families, $h^2 = 4t$. So the most
efficient design has the following family sizes:

With full-sib families:  $n = 2/h^2$

With half-sib families:  $n = 4/h^2$
Since prior knowledge of the heritability will be at the best only
approximate, the optimal family size cannot be exactly determined
beforehand. The loss of efficiency, however, is much greater if the
family size is below the optimum than if it is above. It is therefore
better to err on the side of having too large families. A. Robertson
(1959a) shows that, in the absence of prior knowledge of the
heritability, half-sib analyses should generally be designed with
families of between 20 and 30.

If the experiment has the most efficient design, with n = 1/t, then
the sampling variance of the intra-class correlation is approximately

$$\sigma^2_t = \frac{8t}{T} \qquad (10.9)$$

Therefore under optimal design the sampling variances of the
estimates of heritability are as follows:

From full-sib families:  $\sigma^2_{h^2} = 4\sigma^2_t = 16h^2/T$  (approx.)

From half-sib families:  $\sigma^2_{h^2} = 16\sigma^2_t = 32h^2/T$  (approx.)

Thus, other things being equal, an estimate from full-sib families is
twice as precise as one from half-sib families.
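
Equations 10.8 and 10.9 and the resulting designs can be put side by
side in a short sketch. The function names are hypothetical; the
relation $h^2 = t/r$, with r = 1/2 for full sibs and r = 1/4 for half
sibs, restates the relations $h^2 = 2t$ and $h^2 = 4t$ given in the
text, and common-environment variance is assumed absent.

```python
def var_t_exact(t, n, families):
    """Sampling variance of the intra-class correlation, equation 10.8."""
    return 2 * (1 + (n - 1) * t) ** 2 * (1 - t) ** 2 / (n * (n - 1) * (families - 1))


def var_t_optimal(t, total):
    """Approximate sampling variance under optimal design, equation 10.9."""
    return 8 * t / total


def sib_design(h2, total, family_type):
    """Optimal family size and approximate sampling variance of h2."""
    r = {"full": 0.5, "half": 0.25}[family_type]   # t = r * h2
    t = r * h2
    n = 1 / t                                      # optimal family size
    var_h2 = var_t_optimal(t, total) / r ** 2      # h2 = t / r
    return n, var_h2


n_full, v_full = sib_design(0.2, 1.0, "full")
n_half, v_half = sib_design(0.2, 1.0, "half")
print(f"full sibs: n = {n_full:.0f}, variance = {v_full:.1f}/T")
print(f"half sibs: n = {n_half:.0f}, variance = {v_half:.1f}/T")

# How close is the approximation? Half sibs, t = 0.05, 100 families of 20.
ratio = var_t_exact(0.05, 20, 100) / var_t_optimal(0.05, 20 * 100)
print(f"exact / approximate = {ratio:.2f}")
```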

At  this  point  let  us  compare  the  precision  of  estimates  from  sib 
analyses  with  those  from  offspring-parent  regressions,  assuming 
optimal  design  in  each  case.  Again  we  have  to  choose  a  specific  case 
for  illustration  of  the  comparison.  Let  us  for  simplicity  suppose  as  we 
did  before  that  the  heritability  to  be  estimated  is  20  per  cent.  And, 
though  perhaps  not  very  representative  of  situations  likely  to  arise  in 
practice,  let  us  compare  an  estimate  obtained  from  a  half-sib  analysis 
with  one  obtained  from  the  regression  of  offspring  on  one  parent 
when the offspring consist of full-sib families. The variance of the
estimate of heritability from the half-sib analysis would then be 6.4/T
by substitution in the formula given above, and from the regression of
offspring on one parent it would also be 6.4/T, as we found previously.
In  this  case,  therefore,  these  two  methods  would  give  equally  precise 
estimates  for  a  given  total  number  of  individuals  measured.  If  we  had 
considered  a  higher  heritability,  then  the  regression  method  would 
have  had  the  lower  sampling  variance.  The  comparison  we  have  made, 
though  referring  to  a  particular  case,  illustrates  the  general  conclusion, 
which  is  that  the  regression  method  is  preferable  for  estimating 
moderately  high  heritabilities  and  the  sib  correlation  method  is 
preferable for low heritabilities, the critical heritability being, very
roughly, about 20 per cent when the comparison is made on the basis
of an equal total number of individuals measured.
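
The crossing-over of the two methods at a heritability of about 20 per
cent can be illustrated numerically. The sketch assumes the approximate
formulae given above, optimal design for each method, full-sib offspring
families with t equal to half the heritability for the regression, and
fractional family sizes; the function names are hypothetical.

```python
import math


def var_h2_half_sib(h2, total):
    """Half-sib analysis under optimal design: 32 h2 / T."""
    return 32 * h2 / total


def var_h2_regression_optimal(h2, total):
    """Regression on one parent, full-sib offspring, optimal family size."""
    t = h2 / 2                          # full-sib correlation, no common environment
    n = math.sqrt((1 - t) / t)          # optimal (fractional) family size
    var_b = (1 + (n - 1) * t) * (n + 1) / (n * total)   # equation 10.7, N = T/(n+1)
    return 4 * var_b                    # h2 = 2b


for h2 in (0.1, 0.2, 0.4):
    sib = var_h2_half_sib(h2, 1.0)
    reg = var_h2_regression_optimal(h2, 1.0)
    better = "sib analysis" if sib < reg else "regression"
    print(f"h2 = {h2}: half-sib {sib:.2f}/T, regression {reg:.2f}/T -> {better}")
```

At a heritability of exactly 0.2 the two variances coincide, so the
label printed for that row is decided by rounding error and carries no
meaning.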

Finally let us consider briefly a sib analysis where the half-sib
families are subdivided into full-sib families. The situation is then
more complicated, and for details the reader should consult the papers
of Osborne and Paterson (1952) and A. Robertson (1959a). The
conclusions are as follows. In many cases the estimation of
heritability will be based only on the between-sire component, i.e. the
half-sib correlation. This will arise when common environment
renders the full-sib correlation unsuitable. The most efficient design
then has only one offspring per dam, and is exactly the same as the
half-sib analysis discussed above. If there is no common environment
and it is desired to estimate the correlations from sire and from
dam components with equal precision, then the optimal design has
3 or 4 dams per sire with the number of offspring per dam equal to
$2/h^2$. In the absence of prior knowledge of the heritability the
analysis should be planned with 3 or 4 dams per sire, and 10 offspring
per dam.

Identical  Twins 

Identical  twins  seem  at  first  sight  to  provide,  for  man  and  cattle,  a 
means  of  estimating  the  genotypic  variance.  They  provide  individuals 
of  identical  genotype,  just  as  inbred  lines,  or  crosses  between  lines,  do 
for  laboratory  animals  or  for  plants.  The  phenotypic  variance  within 
pairs  of  identical  twins  should,  therefore,  estimate  the  environmental 
variance  and  so  allow  the  partition  of  the  phenotypic  variance  into 
genotypic  and  environmental  components  to  be  made.  (This  would 
not  estimate  the  heritability,  but  the  use  of  identical  twins  seems 
nevertheless  most  appropriately  discussed  at  this  point.)  Many 
studies of human twins have been made, and have shown the members
of the pairs to be extremely alike in most characters, even when
reared apart from childhood (see Stern, 1949, Ch. 23, for review and
references).  Studies  of  cattle  twins,  though  on  a  much  smaller  scale, 
show  the  same  thing  (see  Hancock,  1954;  Brumby,  1958).  Taken  at 
their  face  value  these  studies  seem  to  indicate  a  very  high  degree  of 
genetic  determination — up  to  90  per  cent  or  even  more — for  many 
characters. The use of identical twins in this way is, however, vitiated
by the additional similarity due to common environment. Twins
share a common environment from conception to birth, and over the
period during which they are reared together, so that the within-pair
variance contains only a part, and perhaps only a small part, of the
total environmental variance. This difficulty may be partially
overcome by the comparison of identical with fraternal twins. Fraternal
twins are full sibs which share a common environment to approximately
the same extent as identical twins. Let us therefore consider
how the causal components of variance contribute to the observational
components between pairs and within pairs for the two sorts of
twins. The compositions of the observational components are given
in Table 10.5, the between-pair component being the phenotypic
covariance. The environmental components are shown as being the
same for fraternal as for identical twins. This is not necessarily true,
but one can proceed only on the assumption that it is.

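Table 10.5

Composition of the components of variance between and
within pairs of twins.

              Between pairs                                     Within pairs

Identicals    $V_A + V_D + V_{Ec}$                              $V_{Ew}$
Fraternals    $\tfrac{1}{2}V_A + \tfrac{1}{4}V_D + V_{Ec}$      $\tfrac{1}{2}V_A + \tfrac{3}{4}V_D + V_{Ew}$
Difference    $\tfrac{1}{2}V_A + \tfrac{3}{4}V_D$               $\tfrac{1}{2}V_A + \tfrac{3}{4}V_D$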

The contributions of the interaction variance, which for simplicity
are omitted, can be added from Table 9.3 (p. 157). If the environmental
components are the same for the two sorts of twins, then the
difference between identicals and fraternals in either of the two
components estimates half the additive variance together with
three-quarters of the dominance variance (and more than three-quarters of
the interaction variance). To take the partitioning further it is necessary to
have  an  estimate  of  the  additive  variance,  reliably  free  from  admixture 
with  variance  due  to  common  environment.  By  subtraction  of  half 
the  additive  variance  we  may  then  obtain  an  estimate  of  three-quarters 
of  the  dominance  variance  together  with  more  than  three-quarters  of 
the  interaction  variance.  This  would  give  at  least  an  approximate  idea 
of  the  amount  of  non-additive  genetic  variance.  There  is,  however,  a 
difficulty  with  cattle  in  comparisons  between  identical  and  fraternal 
twins,  connected  again  with  the  environmental  components  of 
variance.  Vascular  anastomoses  frequently  occur  in  the  placentae  of 
both  sorts  of  twins,  so  that  the  blood  of  the  two  twins  is  mixed.  This 
will  not  make  identicals  any  more  alike,  but  it  may  make  fraternals 
more alike than they would otherwise be.

Some results of twin-studies are quoted in Table 10.6, in order to
illustrate the degree of resemblance between identical and between
fraternal twins in both man and cattle. The difference between the
correlation coefficients of identicals and fraternals, given in the
right-hand column, could be taken as an estimate of half the heritability
if there were no non-additive genetic variance and if there were no
complications arising from a common circulation. But since
non-additive variance cannot reasonably be assumed to be absent, the
difference can only be regarded as setting an upper limit to half the
heritability. The vascular anastomoses in cattle twins may, however,
render the estimates of the heritability, or of its upper limit, too low.

Table 10.6

Resemblance between Twins

                                                 Correlation coefficients
Character                        Reference   Identicals   Fraternals   Difference

Man
  Height                            (1)         .93          .64          .29
  Weight                            (1)         .92          .63          .29
  Intelligence                      (1)         .88          .63          .25
  Birth weight                      (2)         .67          .58          .09

Cattle
  Milk-yield, 1st lactation         (3)         .91          .65          .26
  Butterfat-yield, 1st lactation    (3)         .90          .51          .39
  Fat % in milk, 1st lactation      (3)         .95          .86          .09
  Weight at 96 weeks                (3)         .83          .78          .05
  Body length at 96 weeks           (3)         .75          .62          .13

(1) Newman, Freeman, and Holzinger (1937). Based on 50 pairs of
identicals and 50 pairs of fraternals, corrected for age differences.

(2) Quoted from Robson (1955).

(3) Brumby and Hancock (1956). Based on 10 pairs of identicals and 11
pairs of fraternals.
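
The differences in Table 10.6, and the upper limits to the heritability
that they imply, can be recomputed as below. This is a sketch under the
assumptions stated in the text (equal environmental components for the
two sorts of twins, no complications from a common circulation); the
dictionary and function names are hypothetical, and the doubled
difference is an upper limit, not an estimate.

```python
# (identicals, fraternals) correlation coefficients from Table 10.6.
twin_correlations = {
    "Height (man)": (0.93, 0.64),
    "Weight (man)": (0.92, 0.63),
    "Intelligence (man)": (0.88, 0.63),
    "Birth weight (man)": (0.67, 0.58),
    "Milk-yield (cattle)": (0.91, 0.65),
    "Butterfat-yield (cattle)": (0.90, 0.51),
    "Fat % in milk (cattle)": (0.95, 0.86),
    "Weight at 96 weeks (cattle)": (0.83, 0.78),
    "Body length at 96 weeks (cattle)": (0.75, 0.62),
}


def upper_limit_h2(r_identical, r_fraternal):
    """Twice the difference of correlations: an upper limit to h2."""
    return 2 * (r_identical - r_fraternal)


differences = {
    name: round(r_i - r_f, 2) for name, (r_i, r_f) in twin_correlations.items()
}
limits = {
    name: round(upper_limit_h2(r_i, r_f), 2)
    for name, (r_i, r_f) in twin_correlations.items()
}
for name in twin_correlations:
    print(f"{name}: difference {differences[name]:.2f}, h2 at most {limits[name]:.2f}")
```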


CHAPTER 11

SELECTION: 
I.  The  Response  and  its  Prediction 

Up  to  this  point  in  our  treatment  of  metric  characters  we  have  been 
concerned with the description of the genetic properties of a
population as it exists under random mating, with no influences tending to
change  its  properties;  now  we  have  to  consider  the  changes  brought 
about  by  the  action  of  breeder  or  experimenter.  There  are  two  ways, 
as  we  noted  in  Chapter  6,  in  which  the  action  of  the  breeder  can  change 
the  genetic  properties  of  the  population;  the  first  by  the  choice  of 
individuals  to  be  used  as  parents,  which  constitutes  selection,  and  the 
second  by  control  of  the  way  in  which  the  parents  are  mated,  which 
embraces  inbreeding  and  cross  breeding.  We  shall  consider  selection 
first,  and  in  doing  so  we  shall  ignore  the  effects  of  inbreeding,  even 
though  we  cannot  realistically  suppose  that  we  are  always  dealing 
with  a  population  large  enough  for  its  effects  to  be  negligible. 

The basic effect of selection is to change the array of gene
frequencies in the manner described in Chapter 2. The changes of gene
frequency  themselves,  however,  are  now  almost  completely  hidden 
from  us  because  we  cannot  deal  with  the  individual  loci  concerned 
with  a  metric  character.  We  therefore  have  to  describe  the  effects  of 
selection  in  a  different  manner,  in  terms  of  the  observable  properties 
— means,  variances  and  covariances — though  without  losing  sight  of 
the  fact  that  the  underlying  cause  of  the  changes  we  describe  is  the 
change  of  gene  frequencies.  Before  we  come  to  details  let  us  consider 
the  change  of  gene  frequencies  a  little  further  in  general  terms. 

To describe the change of the genetic properties from one
generation to the next we have to compare successive generations at the same
point  in  the  life  cycle  of  the  individuals,  and  this  point  is  fixed  by  the 
age  at  which  the  character  under  study  is  measured.  Most  often  the 
character  is  measured  at  about  the  age  of  sexual  maturity  or  on  the 
young  adult  individuals.  The  selection  of  parents  is  made  after  the 
measurements, and the gene frequencies among these selected
individuals are different from what they were in the whole population
before selection. If there are no differences of fertility among the
selected  individuals  or  of  viability  among  their  progeny,  then  the  gene 
frequencies  are  the  same  in  the  offspring  generation  as  in  the  selected 
parents.  Thus  artificial  selection — that  is,  selection  resulting  from 
the  action  of  the  breeder  in  the  choice  of  parents — produces  its  change 
of  gene  frequency  by  separating  the  adult  individuals  of  the  parent 
generation  into  two  groups,  the  selected  and  the  discarded,  that  differ 
in  gene  frequencies.  Natural  selection,  operating  through  differences 
of  fertility  among  the  parent  individuals  or  of  viability  among  their 
progeny,  may  cause  further  changes  of  gene  frequency  between  the 
parent  individuals  and  the  individuals  on  which  measurements  are 
made  in  the  offspring  generation.  Thus  there  are  three  stages  at 
which  a  change  of  gene  frequency  may  result  from  selection:  the  first 
through  artificial  selection  among  the  adults  of  the  parent  generation; 
the  second  through  natural  differences  of  fertility,  also  among  the 
adults of the parent generation; and the third through natural
differences of viability among the individuals of the offspring generation.
Though natural differences of fertility and viability are always present
they are not necessarily always relevant, because they are not
necessarily connected with the genes concerned with the metric character.




Response  to  Selection 

The  change  produced  by  selection  that  chiefly  interests  us  is  the 
change  of  the  population  mean.  This  is  the  response  to  selection, 
which  we  shall  symbolise  by  R;  it  is  the  difference  of  mean  phenotypic 
value  between  the  offspring  of  the  selected  parents  and  the  whole  of 
the  parental  generation  before  selection.  The  measure  of  the  selec- 
tion applied  is  the  average  superiority  of  the  selected  parents,  which 
is  called  the  selection  differential,  and  will  be  symbolised  by  S.  It  is 
the  mean  phenotypic  value  of  the  individuals  selected  as  parents 
expressed  as  a  deviation  from  the  population  mean,  that  is  from  the 
mean  phenotypic  value  of  all  the  individuals  in  the  parental  genera- 
tion before  selection  was  made.  To  deduce  the  connexion  between 
response  and  selection  differential  let  us  imagine  two  successive 
generations of a population mating at random, as represented diagrammatically
in Fig. 11.1. Each point represents a pair of parents
and  their  progeny,  and  is  positioned  according  to  the  mid-parent 
value  measured  along  the  horizontal  axis  and  the  mean  value  of  the 
progeny measured along the vertical axis. The origin represents the
population  mean,  which  is  assumed  to  be  the  same  in  both  generations. 
The  sloping  line  is  the  regression  line  of  offspring  on  mid-parent. 
(A diagram of this sort, plotted from actual data, was given in Fig.
10.1.) Now let us regard a group of individuals in the parental
generation  as  having  been  selected — say  those  with  the  highest 
values.  These  pairs  of  parents  and  their  offspring  are  indicated  by 
solid  dots  in  the  figure.  The  parents  have  been  selected  on  the  basis 


Fig. 11.1. Diagrammatic representation of the mean values of
progeny  plotted  against  the  mid-parent  values,  to  illustrate  the 
response  to  selection,  as  explained  in  the  text. 

of  their  own  phenotypic  values,  without  regard  to  the  values  of  their 
progeny  or  of  any  other  relatives.  (This  chapter  deals  exclusively 
with  selection  made  in  this  way:  other  methods  will  be  described  in 
Chapter  13.)  Let  S  be  the  mean  phenotypic  value  of  these  selected 
parents,  expressed  as  a  deviation  from  the  population  mean.  And 
similarly  let  R  be  the  mean  deviation  of  their  offspring  from  the 
population  mean.  Then  S  is  the  selection  differential  and  R  is  the 
response.  The  point  marked  by  the  cross  represents  the  mean  value 
of  the  selected  parents  and  of  their  progeny,  and  it  lies  on  the  regres- 
sion line.  The  regression  coefficient  of  offspring  on  parents  is  thus 
equal  to  R/S.  Therefore  the  connexion  between  response  and  selection 
differential  is 


$R = b_{OP} S$    (11.1)

We saw in the last chapter that the regression of offspring on mid-parent
is equal to the heritability, provided there is no non-genetic
cause  of  resemblance  between  offspring  and  parents.  To  this  we  must 
add  the  further  condition  that  there  should  be  no  natural  selection: 
that  is  to  say,  that  fertility  and  viability  are  not  correlated  with  the 
phenotypic  value  of  the  character  under  study.  Provided  these 
conditions  hold,  therefore,  the  ratio  of  response  to  selection  differ- 
ential is  equal  to  the  heritability,  and  the  response  is  given  by 


$R = h^2 S$    (11.2)


The connexion between the response and the selection differential,
expressed in equation 11.2, follows directly from the meaning of
the  heritability.  We  noted  in  the  last  chapter  (equation  10.2)  that  the 
heritability  is  equivalent  to  the  regression  of  an  individual's  breeding 
value  on  its  phenotypic  value.  The  deviation  of  the  progeny  from 
the  population  mean  is,  by  definition,  the  breeding  value  of  the 
parents,  and  so  the  response  is  equivalent  to  the  breeding  value  of  the 
parents.  Thus  it  follows  that  the  expected  value  of  the  progeny  is 
given by $R = h^2 S$.

There  is  one  point  at  which  the  situation  envisaged  in  deducing 
the  equations  of  response  does  not  coincide  with  what  is  actually 
done  in  selection.  We  supposed  the  individuals  of  the  parent  genera- 
tion to  have  mated  at  random  and  the  selection  to  have  been  applied 
subsequently.  In  practice,  however,  the  selection  is  usually  made 
before  mating,  on  the  basis  of  the  individuals'  values  and  not  the 
mid-parent  values.  The  effect  of  this  is  that  the  individuals,  when 
regarded  as  part  of  the  whole  parental  population,  have  been  mated 
assortatively.  Assortative  mating,  however,  has  very  little  effect  on 
the  offspring-parent  regression,  as  we  noted  in  the  last  chapter,  and 
this  feature  of  selection  procedure  can  therefore  be  disregarded. 

Prediction  of  response.  The  chief  use  of  these  equations  of 
response  is  for  predicting  the  response  to  selection.  Let  us  consider  a 
little  further  the  nature  of  the  prediction  that  can  be  made.  First,  it 
is  clear  that  equation  11.1  is  not  a  prediction  but  simply  a  description, 
because  the  regression  of  offspring  on  parent  cannot  be  measured 
until  the  offspring  generation  has  been  reared.  We  could,  however, 
measure the regression, $b_{OP}$, in a previous generation, and then use
the equation $R = b_{OP} S$ to predict the response to selection. There is
no  genetics  involved  in  this;  it  is  simply  an  extrapolation  of  direct 
observation,  and  the  only  conditions  on  which  it  depends  are  the 
absence of environmental change and the absence of genetic change
between  the  generations  from  which  the  regression  was  estimated  and 
the generation to which selection is applied. The equation $R = h^2 S$,
however, provides a means of prediction based on observations made
only on the individuals of the parent generation before selection. Its
validity rests on obtaining a reliable estimate of $h^2$ from the
resemblance between relatives, such as half sibs; and on the truth of the
identity $b_{OP} = h^2$.

Example 11.1. The selection for abdominal bristle number in Drosophila
melanogaster, by Clayton, Morris, and Robertson (1957), will provide
an  illustration  of  the  prediction  of  the  response,  and  will  serve  also  to 
indicate  the  extent  of  the  agreement  between  observation  and  prediction. 
(The  data  for  this  example  were  kindly  supplied  by  Dr  G.  A.  Clayton.) 
The  heritability  of  bristle  number  was  first  estimated  from  the  base 
population before selection, and the value found was 0.52, as stated in
Example  10.1.  Five  samples  of  100  males  and  100  females  were  taken  from 
the  base  population,  and  selection  for  high  and  for  low  bristle  number  was 
made  in  each  of  the  five  samples,  the  20  most  extreme  individuals  of  each 
sex  being  selected  as  parents.  The  mean  deviations  of  these  selected  indi- 
viduals from  the  mean  of  the  sample  out  of  which  they  were  selected  are 
given  in  the  table  in  the  columns  headed  S,  the  negative  signs  under  down- 
ward selection  being  omitted.  These  are  the  selection  differentials.  The 
expected  responses  are  obtained  by  multiplying  the  selection  differentials 
by the heritability, according to equation 11.2. The observed responses
are the differences between the progeny means and the sample means out
of which the parents were selected. The expected and observed responses
are also given in the table, negative signs being again omitted. Comparison
of the observed with the expected responses shows that on the whole there
is fairly good agreement, though in some lines, particularly lines 3 and 4
selected downward, there are quite serious discrepancies. These discrepancies,
which are typical of selection experiments, illustrate the fact that
a single generation of selection in only one line cannot be relied on to
follow the prediction at all closely.

            Upward selection            Downward selection
                     Response                    Response
Line       S       Exp.    Obs.        S       Exp.    Obs.
1         5.29     2.75    2.60       4.63     2.41    2.44
2         5.12     2.66    2.23       4.58     2.38    2.29
3         4.44     2.31    2.43       4.36     2.27    0.67
4         4.32     2.25    3.12       5.60     2.91    1.13
5         4.88     2.54    2.68       4.12     2.14    2.68
Mean      4.81     2.50    2.61       4.66     2.42    1.84
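
The arithmetic of the expected responses in Example 11.1 can be checked with a few lines of code. What follows is only a sketch in Python; the function and variable names are illustrative and form no part of the original account. It multiplies each selection differential quoted in the table by the heritability, as equation 11.2 requires.

```python
# Expected response R = h^2 * S (equation 11.2), using the figures
# quoted in Example 11.1. All names here are illustrative only.

H2 = 0.52  # heritability of bristle number, as stated in Example 10.1

# Selection differentials S for lines 1 to 5 (signs omitted, as in the table)
S_UP = [5.29, 5.12, 4.44, 4.32, 4.88]
S_DOWN = [4.63, 4.58, 4.36, 5.60, 4.12]


def expected_response(s, h2=H2):
    """Return the expected response to a selection differential s."""
    return h2 * s


def expected_column(differentials, h2=H2):
    """Expected responses, one per line, rounded to two decimals."""
    return [round(expected_response(s, h2), 2) for s in differentials]


if __name__ == "__main__":
    for direction, column in (("up", S_UP), ("down", S_DOWN)):
        print(direction, expected_column(column))
```

The rounded values agree with the "Exp." columns of the table; the observed responses, of course, cannot be computed in this way and must come from the experiment.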


The  prediction  of  response  is  valid,  in  principle,  for  only  one 
generation  of  selection.  The  response  depends  on  the  heritability  of 
the  character  in  the  generation  from  which  the  parents  are  selected. 
The  basic  effect  of  the  selection  is  to  change  the  gene  frequencies,  so 
the  genetic  properties  of  the  offspring  generation,  in  particular  the 
heritability,  are  not  the  same  as  in  the  parent  generation.  Since  the 
changes  of  gene  frequency  are  unknown  we  cannot  strictly  speaking 
predict  the  response  to  a  second  generation  of  selection  without  re- 
determining the  heritability.  Experiments  have  shown,  however, 
that  the  response  is  usually  maintained  with  little  change  over  several 
generations — up  to  five,  ten,  or  even  more.  This  will  be  seen  in  the 
graphs  of  responses  to  selection  given  later  in  this  chapter  and  in  the 
next.  In  practice,  therefore,  the  prediction  may  be  expected  to  hold 
good  over  several  generations.  The  effects  of  selection  over  longer 
periods,  and  also  its  effects  on  properties  other  than  the  mean,  will  be 
discussed  in  a  later  section. 

The  selection  differential.  We  have  seen  that  the  change  of  the 
population  mean  brought  about  by  selection — i.e.  the  response — 
depends  on  the  heritability  of  the  character  and  on  the  amount  of 
selection  applied  as  measured  by  the  selection  differential.  The 
selection  differential  will  not  be  known,  however,  until  the  selection 
among  the  parental  generation  has  actually  been  made.  So  the  equa- 
tions of  response  in  the  form  given  above  are  only  of  limited  useful- 
ness for  predicting  the  response.  To  be  able  to  predict  further  ahead 
we  need  to  know  what  determines  the  magnitude  of  the  selection 
differential.  Consideration  of  the  factors  that  influence  the  selection 
differential  will  also  enable  us  to  see  more  clearly  the  means  by  which 
the  breeder  may  improve  the  response  to  selection. 

The  magnitude  of  the  selection  differential  depends  on  two  fac- 
tors: the  proportion  of  the  population  included  among  the  selected 
group,  and  the  phenotypic  standard  deviation  of  the  character.  The 
dependence  of  the  selection  differential  on  these  two  factors  is  illus- 
trated diagrammatically  in  Fig.  11.2.  The  graphs  show  the  distribu- 
tion of  phenotypic  values,  which  is  assumed  to  be  normal.  The 
individuals  with  the  highest  values  are  supposed  to  be  selected,  so 
that  the  distribution  is  sharply  divided  at  a  point  of  truncation,  all 
individuals  above  this  value  being  selected  and  all  below  rejected. 
The arrow in each figure marks the mean value of the selected group,
and  S  is  the  selection  differential.  In  graph  (a)  half  the  population  is 
selected,  and  the  selection  differential  is  rather  small:  in  graph  (b) 
only  20  per  cent  of  the  population  is  selected,  and  the  selection  differ- 
ential is  much  larger.   In  graph  (c)  20  per  cent  is  again  selected,  but 


Fig. 11.2. Diagrams to show how the selection differential, S,
depends on the proportion of the population selected, and on the
variability of the character. All the individuals in the stippled
areas, beyond the points of truncation, are selected. The axes are
marked in hypothetical units of measurement.

(a) 50% selected; standard deviation 2 units: S = 1.6 units

(b) 20% selected; standard deviation 2 units: S = 2.8 units

(c) 20% selected; standard deviation 1 unit: S = 1.4 units

the  character  represented  is  less  variable  and  the  selection  differential 
is  consequently  smaller.  The  standard  deviation  in  (c)  is  half  as  great 
as  in  (b)  and  the  selection  differential  is  also  half  as  great. 
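
The three values of S quoted for Fig. 11.2 can be recovered numerically. The sketch below is in Python and makes the same assumptions as the figure: a normal distribution of phenotypic values, truncation selection of the highest values, and a large population. It uses the relation between the mean of the selected group and the ordinate at the point of truncation that the text states a little further on as equation 11.5; the function names are illustrative only.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def intensity(p):
    """Mean of the selected fraction p, in standard deviations.

    Assumes truncation selection of the highest values out of a large,
    normally distributed population: i = z / p, where z is the height
    of the ordinate at the point of truncation.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if p == 1.0:
        return 0.0  # everything is kept, so there is no superiority
    x = _STD_NORMAL.inv_cdf(1.0 - p)  # point of truncation
    z = _STD_NORMAL.pdf(x)            # height of the ordinate there
    return z / p


def selection_differential(p, sd):
    """Selection differential S in the units of measurement."""
    return intensity(p) * sd


if __name__ == "__main__":
    # (proportion selected, standard deviation) for graphs (a), (b), (c)
    for label, p, sd in (("a", 0.5, 2.0), ("b", 0.2, 2.0), ("c", 0.2, 1.0)):
        print(label, round(selection_differential(p, sd), 1))
```

Rounded to one decimal the three results are 1.6, 2.8 and 1.4 units, as in the caption; halving the standard deviation halves S because the intensity depends on p alone.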

The  standard  deviation,  which  measures  the  variability,  is  a 
property  of  the  character  and  the  population,  and  it  sets  the  units  in 
which  the  response  is  expressed — i.e.  so  many  pounds,  millimetres, 
bristles,  etc.  The  response  to  selection  may  be  generalised  if  both 
response  and  selection  differential  are  expressed  in  terms  of  the 
phenotypic standard deviation, $\sigma_P$. Then $R/\sigma_P$ is a generalised
measure of the response, by means of which we can compare different
characters and different populations; and $S/\sigma_P$ is a generalised measure
of the selection differential, by means of which we can compare
different methods or procedures for carrying out the selection. The
"standardised" selection differential, $S/\sigma_P$, will be called the intensity
of selection, symbolised by i. Since $S = i\sigma_P$, the equation of response
(11.2) then becomes

$R = i h^2 \sigma_P$    (11.3)

By noting that $h = \sigma_A/\sigma_P$, where $\sigma_A$ is the standard deviation of
breeding values (square root of the additive genetic variance), we may write
this equation in the form

$R = i h \sigma_A$    (11.4)

which  is  sometimes  used  in  comparisons  of  different  methods  of 
selection. 

The intensity of selection, i, depends only on the proportion of
the  population  included  in  the  selected  group,  and,  provided  the 

Fig. 11.3. Intensity of selection in relation to proportion selected.
The  intensity  of  selection  is  the  mean  deviation  of  the  selected 
individuals,  in  units  of  phenotypic  standard  deviations.  The  upper 
graph  refers  to  selection  out  of  a  large  total  number  of  individuals 
measured:  the  lower  two  graphs  refer  to  selection  out  of  totals  of  20 
and  10  individuals  respectively. 

distribution of phenotypic values is normal, it can be determined
from  tables  of  the  properties  of  the  normal  distribution.  If  p  is  the 
proportion  selected — i.e.  the  proportion  of  the  population  falling 
beyond  the  point  of  truncation — and  z  is  the  height  of  the  ordinate  at 
the  point  of  truncation,  then  it  follows  from  the  mathematical 
properties  of  the  normal  distribution  that 

$\dfrac{S}{\sigma_P} = i = \dfrac{z}{p}$    (11.5)

Thus,  given  only  the  proportion  selected,  p,  we  can  find  out  by  how 
many  standard  deviations  the  mean  of  the  selected  individuals  will 
exceed  the  mean  of  the  population  before  selection:  that  is  to  say,  the 
intensity of selection, i. The graphs in Fig. 11.3 show the relationship
between i and p; the value of i for any given value of p can be
read  from  the  graphs  with  sufficient  accuracy  for  most  purposes.  The 
relationship between i and p given in equation 11.5 applies, strictly
speaking,  only  to  a  large  sample:  that  is  to  say,  when  a  large  number  of 
individuals  have  been  measured,  among  which  the  selection  is  to  be 
made.  When  selection  is  made  out  of  a  small  number  of  measured 
individuals,  the  mean  deviation  of  the  selected  group  is  a  little  less. 
The  intensity  of  selection  can  be  found  from  tables  of  deviations  of 
ranked data (Table XX of Fisher and Yates, 1943). The two lower
curves in Fig. 11.3 show the intensity of selection for samples of 10
and 20. Selection intensities for samples smaller than 10 are given
in Table 11.1.

Table 11.1

Intensities of selection when selection is made out of a small
number of individuals measured. The figures in the table
are values of $i = S/\sigma_P$ = mean deviation in standard measure.

Number                        Size of sample
selected     9      8      7      6      5      4      3      2
   1        1.49   1.42   1.35   1.27   1.16   1.03   0.85   0.56
   2        1.21   1.14   1.06   0.96   0.83   0.67   0.42    -
   3        1.00   0.91   0.82   0.70   0.55   0.34    -      -
   4        0.82   0.72   0.62   0.48   0.29    -      -      -
   5        0.66   0.55   0.42   0.25    -      -      -      -
   6        0.50   0.38   0.23    -      -      -      -      -
   7        0.35   0.20    -      -      -      -      -      -
   8        0.19    -      -      -      -      -      -      -
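
For readers without access to tables of ranked data, intensities for small samples can be approximated numerically. The sketch below, in Python, is not the method by which the published table was compiled; it assumes normally distributed values, computes the expected value of each ranked deviate by simple trapezoidal integration, and averages the highest ones. The function names, the integration range and the number of steps are all choices made here for illustration.

```python
from math import comb
from statistics import NormalDist

_N = NormalDist()  # standard normal distribution


def expected_order_statistic(r, n, lo=-8.0, hi=8.0, steps=16000):
    """Expected value of the r-th smallest of n standard normal deviates.

    Evaluates r*C(n, r) * integral of x F(x)^(r-1) (1-F(x))^(n-r) f(x) dx
    by the trapezoidal rule, F and f being the normal cdf and pdf.
    """
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        x = lo + k * h
        cdf = _N.cdf(x)
        y = x * cdf ** (r - 1) * (1.0 - cdf) ** (n - r) * _N.pdf(x)
        total += y if 0 < k < steps else 0.5 * y  # half weight at the ends
    return r * comb(n, r) * total * h


def small_sample_intensity(selected, n):
    """Mean expected deviation of the highest `selected` out of n measured."""
    if not 0 < selected <= n:
        raise ValueError("selected must lie between 1 and n")
    top = [expected_order_statistic(r, n)
           for r in range(n - selected + 1, n + 1)]
    return sum(top) / selected


if __name__ == "__main__":
    for selected, n in ((1, 9), (2, 5), (1, 2)):
        print(selected, n, round(small_sample_intensity(selected, n), 3))
```

The results should agree with the corresponding entries of Table 11.1 to within rounding, and they fall a little below the large-sample value z/p for the same proportion, as the text says.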
Example 11.2. A comparison of the expected and observed responses
under  different  intensities  of  selection  was  made  by  Clayton,  Morris,  and 
Robertson  (1957),  studying  abdominal  bristle  number  in  Drosophila.  The 
heritability was first determined by three methods which yielded a
combined estimate of 0.52 (see Example 10.1). The standard deviation of
bristle number (average of the two sexes) was 3.35. Selection at four
different  intensities  was  carried  on  for  five  generations,  both  upward  and 
downward  (i.e.  both  for  increased  and  for  decreased  bristle  number).  In 
each  case  20  males  and  20  females  were  selected  as  parents,  the  intensity 
being  varied  by  the  number  out  of  which  these  were  selected,  as  shown  in 
the  first  column  of  the  table.  The  intensities  of  selection  corresponding  to 
these  proportions  selected  may  be  read  off  the  graphs  in  Fig.  11.3.  They 
are given in the second column of the table. The expected responses are
then found from equation 11.3. Under the most intense selection, for
example, it is $R = 1.4 \times 3.35 \times 0.52 = 2.44$. There were five replicate lines
in both directions under the most intense selection, and three replicates
under the other intensities. The observed responses are quoted in the last
two columns of the table. Although they do not agree very precisely
with expectation, they show how the change made by selection falls off as
the intensity of selection is reduced, and the data serve to illustrate the
computation of the expected response.

                                   Mean response per generation
Proportion        Intensity of                    Observed
selected, p       selection, i     Expected      Up       Down
20/100 = 0.20        1.40            2.44        2.02     1.48
20/75  = 0.267       1.23            2.14        2.20     1.26
20/50  = 0.40        0.97            1.65        1.46     0.79
20/25  = 0.80        0.34            0.59        0.28    -0.08
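
The computation quoted for the most intense selection can be set out as a short sketch in Python. It takes the response to be the product of the intensity of selection, the heritability and the phenotypic standard deviation (equation 11.3), and checks that the alternative form in terms of the standard deviation of breeding values (equation 11.4) gives the same figure. The function names are illustrative, and the standard deviation of breeding values is obtained here as h times the phenotypic standard deviation, which follows from the definition of the heritability.

```python
from math import isclose, sqrt


def response_from_intensity(i, h2, sd_p):
    """Equation 11.3: R = i * h^2 * sigma_P."""
    return i * h2 * sd_p


def response_from_breeding_sd(i, h2, sd_p):
    """Equation 11.4: R = i * h * sigma_A, with sigma_A = h * sigma_P."""
    h = sqrt(h2)
    sd_a = h * sd_p  # standard deviation of breeding values
    return i * h * sd_a


if __name__ == "__main__":
    # Figures from Example 11.2: i = 1.40, h^2 = 0.52, sigma_P = 3.35
    r = response_from_intensity(1.40, 0.52, 3.35)
    print(round(r, 2))
    # The two forms of the equation are algebraically identical.
    assert isclose(r, response_from_breeding_sd(1.40, 0.52, 3.35))
```

The intensities for the other rows were read from a graph, so expected responses recomputed from the rounded values of i in the table need not agree with the printed figures in the last decimal place.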


It  will  now  be  clear  that  there  are  two  methods  open  to  the  breeder 
for  improving  the  rate  of  response  to  selection:  one  by  increasing  the 
heritability  and  the  other  by  reducing  the  proportion  selected  and  so 
increasing  the  intensity  of  selection.  The  heritability  can  be  increased 
only  by  reducing  the  environmental  variation  through  attention  to  the 
technique  of  rearing  and  management.  Reducing  the  proportion 
selected  seems  at  first  sight  to  be  a  straightforward  means  of  improv- 
ing the  response,  but  there  are  several  factors  to  be  considered  which 
set  a  limit  to  what  the  breeder  can  do  in  this  way.  First  is  the  matter 
of  population  size  and  inbreeding.  This  sets  a  lower  limit  to  the 
number  of  individuals  to  be  used  as  parents.  In  experimental  work, 
for  example,  one  might  decide  to  use  not  less  than  10  or  even  20  pairs 
of parents; and in livestock improvement, particularly if artificial
insemination  came  into  general  use  as  a  means  of  intense  selection  on 
males,  care  would  have  to  be  taken  not  to  restrict  the  number  of 
males  too  much.  For  this  reason  the  intensity  of  selection  can  be 
increased  above  a  certain  point  only  by  increasing  the  total  number  of 
individuals  measured,  out  of  which  the  selection  is  made.  With 
organisms  that  have  a  high  reproductive  rate,  such  as  Drosophila  and 
plants,  very  large  numbers  can,  in  principle,  be  measured;  but  in 
practice  a  limit  is  set  to  the  intensity  of  selection  by  the  time  and 
labour  required  for  the  measurement.  With  organisms  that  have  a 
low  reproductive  rate  the  limit  to  the  intensity  of  selection  is  set  by 
the  reproductive  rate,  since  the  proportion  saved  can  never  be  less 
than  the  proportion  needed  for  replacement;  that  is  to  say,  two 
individuals  are  needed  on  the  average  to  replace  each  pair  of  parents. 
Usually  fewer  males  are  needed  than  females,  because  each  male  can 
mate  with  several  females,  and  so  the  males  leave  more  offspring  than 
the  females.  A  higher  intensity  of  selection  can  then  be  made  on 
males  than  on  females.  Suppose,  for  example,  that  females  leave  on 
the  average  5  offspring,  and  each  male  mates  with  10  females,  so  that 
males  leave  on  the  average  50  offspring.  Then  the  proportion  of 
females  selected  cannot  be  less  than  1/5,  but  only  1/50  of  the  males 
need  be  selected.  The  upper  limits  of  the  intensity  of  selection  in  this 
case would be 1.40 for females, and 2.64 for males.

The  number  of  offspring  produced  by  a  pair  of  parents  depends 
not  only  on  their  reproductive  rate  but  also  on  how  long  the  breeder 
is  willing  to  wait  before  he  makes  the  selection.  This  introduces  a 
new  factor — the  interval  of  time  between  generations — which  we 
have  not  yet  taken  into  account  in  the  treatment  of  the  response  to 
selection,  and  which  we  must  now  consider. 

Generation  interval.  The  progress  per  unit  of  time  is  usually 
more  important  in  practice  than  the  progress  per  generation,  so  the 
interval  between  generations  is  an  important  factor  in  reckoning  the 
response  to  selection.  The  generation  interval  is  the  interval  of  time 
between  corresponding  stages  of  the  life  cycle  in  successive  genera- 
tions, and  it  is  most  conveniently  reckoned  as  the  average  age  of  the 
parents  when  the  offspring  are  born  that  are  destined  to  become 
parents  in  the  next  generation.  By  waiting  until  more  offspring  have 
been  reared  before  he  makes  the  selection  the  breeder  can  increase  the 
intensity  of  selection  and  the  response  per  generation;  but  in  doing  so 
he  inevitably  increases  the  generation   interval   and  may  thereby 
reduce the response per unit of time. There is thus a conflict of
interest  between  intensity  of  selection  and  generation  interval,  and 
the  best  compromise  must  be  found  between  the  two.  Increasing  the 
number  of  offspring  will  pay  up  to  a  certain  point,  and  beyond  this 
point  it  will  not.  The  optimal  number  of  offspring  cannot  be  stated 
in  general  terms,  and  each  case  must  be  worked  out  according  to  its 
special  circumstances.  The  procedure  is  explained  in  the  following 
example,  referring  to  mice. 

Example  11.3.  Let  us  suppose  that  selection  is  to  be  applied  to  some 
character  in  mice,  and  that  speed  of  progress  per  unit  of  time  is  the  aim. 
The  question  is:  how  many  litters  should  be  raised?  To  find  the  number  of 
litters  that  will  give  the  maximum  speed  of  progress  we  have  to  find  the 
intensity  of  selection  and  the  generation  interval.  The  ratio  of  the  two  will 
then  give  the  relative  speed.  The  actual  speed  could  be  obtained  by  multi- 
plying by  the  heritability  and  the  standard  deviation,  but  these  factors  will 
be  assumed  to  be  independent  of  the  number  of  litters  raised.  A  comparison 
of  the  expected  rates  of  progress  per  week  is  made  in  the  table.  The  com- 
parison is  made  for  three  different  average  sizes  of  litter,  meaning  the 
number  of  young  reared  per  litter.  It  is  assumed  that  the  character  to  be 
selected  can  be  measured  before  sexual  maturity,  and  that  first  litters  are 
born  when  the  parents  are  9  weeks  old,  subsequent  litters  following  at 
intervals  of  4  weeks.  It  is  assumed  also  that  the  population  is  large  enough 
to  be  treated  as  a  large  sample  in  reckoning  the  intensity  of  selection;  and 
that  equal  numbers  of  males  and  females  are  selected.  The  optimal 
number of litters differs according to the number reared per litter. If 6
young are reared the maximum speed is attained by rearing only one
litter. If 4 young are reared it is worth while to wait for second litters
before making the selection, but not for third litters. If only 2 young are
reared per litter, raising three litters gives the maximum speed of progress.
Most mouse stocks are able to rear 6 young per litter, so under most
circumstances it is best to make the selection from the first litters, and not to
wait for second litters. This conclusion could hardly have been guessed at
without the computations shown in the table.

Column headings:
  L   = number of litters raised.
  t   = generation interval in weeks.
  p   = proportion selected.
  i   = intensity of selection.
  i/t = relative speed of progress.
  N   = number of young reared per litter.
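
The kind of computation described in Example 11.3 can be sketched in Python as follows. Several things here are assumptions and not taken from the original table: the proportion selected is reckoned as 2/(L x N), two offspring being needed to replace each pair of parents; the intensity is the large-sample value z/p; and the generation interval is taken as 9 + 4(L - 1) weeks, that is, the age of the parents when the last litter raised is born. This last assumption is adopted because it leads to the conclusions stated in the example. All function names are illustrative.

```python
from statistics import NormalDist

_N = NormalDist()  # standard normal distribution


def intensity(p):
    """Large-sample intensity of selection, i = z / p."""
    if p >= 1.0:
        return 0.0  # all offspring are needed, so no selection is possible
    x = _N.inv_cdf(1.0 - p)  # point of truncation
    return _N.pdf(x) / p


def generation_interval(litters, first=9, spacing=4):
    """ASSUMPTION: weeks from birth of parents to birth of the last litter."""
    return first + spacing * (litters - 1)


def relative_speed(litters, young_per_litter):
    """Relative speed of progress per week, i / t."""
    p = 2.0 / (litters * young_per_litter)  # ASSUMPTION: 2 replacements per pair
    return intensity(p) / generation_interval(litters)


def best_number_of_litters(young_per_litter, max_litters=6):
    """Number of litters giving the greatest relative speed."""
    return max(range(1, max_litters + 1),
               key=lambda litters: relative_speed(litters, young_per_litter))


if __name__ == "__main__":
    for n in (2, 4, 6):
        row = [round(relative_speed(l, n), 3) for l in range(1, 5)]
        print("N =", n, "i/t for L = 1..4:", row, "best L:", best_number_of_litters(n))
```

Under these assumptions the optimum is one litter when 6 young are reared, two litters when 4 are reared, and three litters when only 2 are reared, in agreement with the text. A different reckoning of the generation interval would shift these optima, so the sketch should be adapted to the circumstances of each case.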


Measurement  of  Response 

When  one  or  more  generations  of  selection  have  been  made  the 
measurement  of  the  response  actually  obtained  introduces  several 
problems.  These  are  matters  of  procedure  rather  than  of  principle 
and  will  be  only  briefly  discussed. 

Variability  of  generation  means.  The  first  problem  to  be 
solved  arises  from  the  variability  of  generation  means.  Inspection  of 
any  of  the  graphs  of  selection  given  in  the  examples  shows  that  the 
generation  means  do  not  progress  in  a  simple  regular  fashion,  but 
fluctuate  erratically  and  more  or  less  violently.  There  are  two  main 
causes  of  this  variation  between  the  generation  means:  sampling 
variation,  depending  on  the  number  of  individuals  measured;  and 
environmental  change,  which  is  usually  the  more  important  of  the 
two.  The  consequence  of  this  variation  between  generation  means  is 
that  the  response  can  seldom  be  measured  with  any  pretence  of 
accuracy  until  several  generations  of  selection  have  been  made.  The 
best  measure  of  the  average  response  per  generation  is  then  obtained 
from  the  slope  of  a  regression  line  fitted  to  the  generation  means,  the 
assumption  being  made  that  the  true  response  is  constant  over  the 
period.  The  variation  between  generation  means  appears  as  error 
variation  about  the  regression  line,  and  the  standard  error  of  the 
estimate  of  response  is  based  on  it.  Variation  due  to  changes  of 
environment  can,  of  course,  be  overcome,  or  at  least  reduced,  by  the 
use  of  a  control  population.  The  measurement  of  the  response  can, 
however,  be  improved  in  accuracy  if  the  "control"  is  not  an  un- 
selected  population  but  is  selected  in  the  opposite  direction.  This  is 
known  as  a  "two-way"  selection  experiment.  The  response  measured 
from  the  divergence  of  the  two  lines  is  then  about  twice  as  great  as 
that  of  the  lines  separately,  and  the  variation  between  generations  is 
reduced  to  the  extent  that  the  environmental  changes  affect  both  lines 
alike.  An  unselected  control  is,  however,  preferable  if  for  practical 
reasons  one  is  interested  only  in  the  change  in  one  direction,  because 
the  response  is  not  always  equal  in  the  two  directions.  This  point  will 
be  discussed  in  the  next  chapter. 
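
The fitting of a regression line to the generation means may be sketched in Python as follows. The least-squares formulae are the ordinary ones; the generation means in the usage part are hypothetical figures invented for illustration, not data from any experiment, and the function name is likewise illustrative.

```python
from math import sqrt


def regression_slope(xs, ys):
    """Least-squares slope of ys on xs, with its standard error.

    The slope estimates the average response per generation; the
    standard error is based on the variation of the generation means
    about the fitted line.
    """
    n = len(xs)
    if n < 3 or n != len(ys):
        raise ValueError("need at least three paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    se = sqrt(residual_ss / (n - 2) / sxx)
    return slope, se


if __name__ == "__main__":
    generations = [0, 1, 2, 3, 4, 5]
    # HYPOTHETICAL generation means, invented for illustration only
    up = [20.0, 20.4, 20.5, 21.1, 21.2, 21.8]
    down = [20.0, 19.5, 19.4, 18.7, 18.6, 18.0]
    divergence = [u - d for u, d in zip(up, down)]
    for label, means in (("up", up), ("down", down), ("divergence", divergence)):
        slope, se = regression_slope(generations, means)
        print(f"{label}: {slope:.3f} +/- {se:.3f}")
```

In a two-way experiment the divergence is simply the difference between the means of the two lines in each generation, so that environmental changes affecting both lines alike cancel before the line is fitted.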
Example 11.4. Fig. 11.4 shows the results of 11 generations of two-way
selection  for  body  weight  in  mice  (Falconer,  1953).  On  the  left  the  "up" 
and  "down"  lines  are  shown  separately,  and  on  the  right  the  divergence  be- 
tween the  two  is  shown.  Linear  regression  lines  are  fitted  to  the  observed 


Fig. 11.4. Two-way selection for 6-week weight in mice (horizontal
axis: generations). Explanation in Example 11.4. (Redrawn from
Falconer, 1953.)

generation  means.  (The  first  generation  of  selection  is  disregarded  be- 
cause the  method  of  selection  was  different.)  The  estimates  of  the  average 
response  per  generation,  with  their  standard  errors,  are  as  follows: 

                 Response ± standard error
                 in grams per generation
Up                    0.27 ± 0.050
Down                  0.62 ± 0.046
Divergence            0.88 ± 0.036

The  difference  between  the  upward  and  downward  responses  will  be  dis- 
cussed in  the  next  chapter. 

The  foregoing  example  shows  how  the  variation  of  the  generation 
means can be reduced when the response is measured from the difference
between two lines, each acting in the manner of a control for the
other.  Controls,  however,  are  not  always  available,  and  then  a  more 
serious  difficulty  may  arise  from  progressive  changes  of  environment. 
This  makes  it  difficult  to  assess  the  effectiveness  of  selection  in  the 
improvement  of  domesticated  animals,  and  to  a  lesser  extent  of  plants, 
because  in  the  absence  of  a  control  there  is  no  sure  way  of  deciding 

how much of the improvement is due to selection and how much to a
progressive  change  in  the  conditions  of  management. 

Example 11.5. Lush (1950) has assembled a number of graphs showing the
improvement of farm animals that has taken place during the present
century. Instead of reproducing any of these graphs we give in the table
an indication of the increase of yield per individual over a period of
years, as a percentage of the initial yield. It is difficult to avoid the
conclusion that much of the improvement of these characters is the result
of selection, but in the absence of any standard of comparison it is very
difficult to decide how much is due to selection and how much to improved
methods of feeding and management.


    Character                 Country        Period       Improvement, %

    Cows:
      Milk-yield              Sweden         1920-1944         21
      Butterfat-yield         New Zealand    1910-1940         47
      Fat % in milk           Netherlands    1906-1945         22
    Pigs:
      Efficiency of growth    Denmark        1922-1949         16
      Body length             Denmark        1926-1949          5
    Sheep:
      Fleece weight           Australia      1881-1945         71
    Hens:
      Egg production          U.S.A.         1909-1950         64

Weighting the selection differential. In experimental selection
the selection differential as well as the response has to be measured
because it is the relationship between the two, and not the response
alone, that is of interest from the genetic point of view. We have to
distinguish between the expected and the effective selection
differential, because in practice the individual parents do not contribute
equally to the offspring generation. Differences of fertility are always
present so that some parents contribute more offspring than others.
To obtain a measure of the selection differential that is relevant to
the response observed in the mean of the offspring generation we
therefore have to weight the deviations of the parents according to
the number of their offspring that are measured. The expected
selection differential is the simple mean phenotypic deviation of the
parents as defined at the beginning of this chapter; the effective
selection differential is the weighted mean deviation of the parents,
the weight given to each parent, or pair of parents, being their
proportionate contribution to the individuals that are measured in the
next generation.
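The distinction between the expected and the effective selection
differential can be sketched in a few lines of Python. The figures below
are hypothetical and the function names are introduced only for
illustration; each "parent" value may equally be read as the mean of a
pair of parents.

```python
def expected_selection_differential(parent_values, population_mean):
    """Simple (unweighted) mean deviation of the selected parents."""
    deviations = [v - population_mean for v in parent_values]
    return sum(deviations) / len(deviations)


def effective_selection_differential(parent_values, offspring_counts,
                                     population_mean):
    """Mean deviation of the parents, each weighted by the number of its
    offspring that are measured in the next generation."""
    total_offspring = sum(offspring_counts)
    weighted_sum = sum(n * (v - population_mean)
                       for v, n in zip(parent_values, offspring_counts))
    return weighted_sum / total_offspring


# Hypothetical values: population mean and four selected parents (or pairs).
population_mean = 20.0
parent_values = [24.0, 23.0, 22.5, 22.0]
# The most extreme parent is assumed to leave the fewest measured offspring.
offspring_measured = [2, 6, 8, 8]

S_expected = expected_selection_differential(parent_values, population_mean)
S_effective = effective_selection_differential(
    parent_values, offspring_measured, population_mean)

print(f"expected  S = {S_expected:.3f}")
print(f"effective S = {S_effective:.3f}")
```

In this constructed case the effective differential is the smaller of the
two, because the parent deviating most from the mean contributes least to
the next generation.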

The weighting of the selection differential takes account of a good
part of the effects of natural selection. If the differences of fertility
are related to the parents' phenotypic values for the character being
selected, then this natural selection will either help or hinder the
artificial selection. If, for example, the more extreme phenotypes are
less fertile or more frequently sterile, then natural selection is working
against artificial selection. By weighting the selection differential we
measure the joint effects of natural and artificial selection together.
A comparison of the effective (i.e. weighted) with the expected
selection differential may thus be used to discover whether natural
selection is operative.

Example 11.6. In an experiment with mice, selection for body size
(weight at 6 weeks) was carried through 30 generations in the upward
direction and 24 generations in the downward direction (see Falconer,
1955). Comparisons are made in the table between the effective (weighted)
and the expected (unweighted) selection differentials in the two lines. The
period of selection is divided into two parts and the comparisons are made
separately in each. Throughout the whole of the upward selection there
was virtually no difference between the effective and expected selection
differential, and we can conclude that natural selection was unimportant
as a factor influencing the response. The situation in the downward
selected line, however, is different, the effective selection differential
being less than the expected, especially in the second part. From this we
can conclude that natural selection was operating in favour of large size,
thus hindering the artificial selection and reducing the response obtained,
particularly in the latter part of the experiment. The cause of the natural
selection and the reason why it operated only in the downward selected
line were as follows. Large mice produce larger litters than small mice; but
for the purpose of standardisation, litters were artificially reduced to 8
young at birth. At the beginning, and throughout the whole period in the
upward selected line, there were few litters with less than 8 young, and so
the differential fertility had no consequence in the upward selected line.
In the downward selected line, however, there was soon no standardisation
because there were few litters with as many as 8 young. Thus the smaller
mice produced fewer young and this reduced the effective selection
differential. In the second part of the experiment the smallest mice did
not breed at all and this reduced the effective selection differential
still further.

                                  Selection differential per
    Direction of    Generation    generation (gms.)            Effective
    selection       numbers       Expected     Effective       / Expected

    Upwards          1-22           1.39         1.36            0.98
                    23-30           1.08         1.09            1.01
    Downwards        1-18           1.03         0.96            0.93
                    19-24           0.82         0.70            0.86

The weighting of the selection differential does not take account
of the whole effect of natural selection. We noted at the beginning of
the chapter that natural selection may operate at two stages, through
differences of fertility among the parents and through differences of
viability among the offspring. The effects of differences of viability
among the offspring are not accounted for in the effective selection
differential. For further examples and a fuller account of the
interaction of natural and artificial selection see Lerner (1954, 1958).

Realised heritability. The equation of response, $R = h^2 S$ (11.2),
which we discussed earlier from the point of view of predicting the
response, can be looked at the other way round, as a means of
estimating the heritability from the result of selection already carried
out, the heritability being estimated as the ratio of response to
selection differential:

$$h^2 = \frac{R}{S} \tag{11.5}$$

The same conditions are necessary for the valid use of the equation
for estimating heritability as for predicting response, except that now
by weighting the selection differential a good part of the effects of
natural selection can be taken account of. There is also the condition
that the observed response should not be confounded with systematic
changes of generation mean due to the environment or the effects of
inbreeding. This, and the absence of maternal effects, are the
important conditions for the valid estimation of heritability from the
response to selection.

The ratio of response to selection differential, however, has an
intrinsic interest of its own, quite apart from whether it provides a
valid estimate of the heritability. It provides the most useful
empirical description of the effectiveness of selection, which allows
comparison of different experiments to be made even when the intensity
of selection is not the same. The term realised heritability will be used
to denote the ratio R/S, irrespective of its validity as a measure of the
true heritability. The realised heritability is estimated as follows.
The generation means are plotted against the cumulated selection
differential. That is to say, the selection differentials, appropriately
weighted, are summed over successive generations so as to give the
total selection applied up to the generation in question. A regression
line is then fitted to the points and the slope of this line measures the
average value of R/S, the realised heritability.
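The procedure just described may be sketched in Python as follows. The
data are hypothetical, constructed so that every generation responds by
exactly 0.3 of the selection applied, and the function name
`realised_heritability` is introduced here only for illustration; real
generation means would of course scatter about the fitted line.

```python
from itertools import accumulate
from statistics import mean


def realised_heritability(generation_means, selection_differentials):
    """Slope of the regression of generation mean on cumulated selection
    differential.

    generation_means has one more entry than selection_differentials:
    the first entry is the mean before any selection was applied.
    """
    if len(generation_means) != len(selection_differentials) + 1:
        raise ValueError("need one more generation mean than differentials")
    # Total selection applied up to each generation (none before the first).
    cumulated = [0.0] + list(accumulate(selection_differentials))
    xbar, ybar = mean(cumulated), mean(generation_means)
    sxx = sum((x - xbar) ** 2 for x in cumulated)
    sxy = sum((x - xbar) * (y - ybar)
              for x, y in zip(cumulated, generation_means))
    return sxy / sxx


# Hypothetical weighted selection differentials for five generations,
# and the generation means (the first is the unselected base mean).
weighted_S = [1.2, 1.0, 1.1, 0.9, 1.0]
means = [20.00, 20.36, 20.66, 20.99, 21.26, 21.56]

h2_realised = realised_heritability(means, weighted_S)
print(f"realised heritability = {h2_realised:.3f}")
```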

Example 11.7. Fig. 11.5 shows the results of 21 and 18 generations
of two-way selection for 6-week weight in mice (Falconer, 1954a). The
generation means are plotted against the cumulated selection differential
and linear regression lines are fitted to the points. The realised
heritabilities, estimated from the slopes of these lines, are:

    Upward selection:      0.175 ± 0.0161
    Downward selection:    0.518 ± 0.0231

The difference between the upward and downward selection is referred to
in the next chapter.

Fig. 11.5. Two-way selection for 6-week weight in mice. Response
plotted against cumulated selection differential, as explained
in Example 11.7. (From Falconer, 1954a; reproduced by courtesy
of the editor of the International Union of Biological Sciences.)


Change  of  Gene  Frequency  under  Artificial  Selection 


It was pointed out at the beginning of this chapter that the change
of the population mean resulting from selection is brought about
through changes of the gene frequencies at the loci which influence
the character selected. But since the effects of the loci cannot be
individually identified, the changes of gene frequency cannot in
practice be followed. Consequently the process of selection for a
metric character had to be described in terms of the selection
differential, or the intensity of selection, and of the change of the
population mean, representing the combined effects of all the loci. This
leaves unanswered the fundamental question: How great are the
changes of gene frequency underlying the response of a metric
character to selection? To answer this question, and so to bridge the
gap between the treatment of selection given in this chapter and that
given earlier in Chapter 2, we have to find the connexion between the
intensity of selection (i) and the coefficient of selection (s) operating
on a particular locus.

The effect of selection for a metric character on one of the loci
concerned may best be pictured in the manner illustrated in Fig. 11.6.
This refers to a locus with two alleles of which one ($A_1$) is
completely dominant. With respect to this locus, therefore, the
population is divided into two portions which differ in their mean
phenotypic values by an amount $2a$, this being the difference between the
two homozygotes in the notation of earlier chapters (see Fig. 7.1,
p. 113). It is assumed that the residual variance within each portion is
the same, this residual variance arising from all the other loci as well
as from environmental causes. The proportion of individuals in the
two portions depends on the gene frequency at the locus, $q^2$ being in
the portion consisting of $A_2A_2$ genotypes, and $1 - q^2$ in the portion
containing $A_1A_1$ and $A_1A_2$ genotypes. When artificial selection is
applied, a proportion of the whole population lying beyond the point
of truncation is cut off, and the proportion of $A_2A_2$ genotypes is lower
among this selected group than in the population as a whole,
selection acting in the case illustrated against the $A_2$ allele. Now, the
new gene frequency, $q_1$, is the frequency of $A_2$ genes among the selected
group of individuals. This may be found by deducing the regression
of gene frequency on phenotypic value, $b_{qP}$. The selected group
deviates in mean phenotypic value from the population mean by an
amount $S$, which is the selection differential. The gene frequency
among the selected group will then be given by the regression
equation

$$q_1 = q + b_{qP} S \tag{11.6}$$

Fig. 11.6. Selection for a metric character operating on one of
the loci concerned. The frequency of A2A2 as depicted is q2 = I.

The regression of gene frequency on phenotypic value is found as
follows. The three genotypes are listed in Table 11.2 with their
frequencies in the whole population. The third column of the table
gives the frequency of the $A_2$ allele among each of the three
genotypes, which is simply 0, 1/2, and 1. The last column gives the
genotypic values. Provided there is no correlation between genotype and
environment, these are also the mean phenotypic values of each
genotype. There is now no assumption of complete dominance.

Table 11.2

    Genotype      Frequency     q       G

    $A_1A_1$      $p^2$         0       $+a$
    $A_1A_2$      $2pq$         1/2     $d$
    $A_2A_2$      $q^2$         1       $-a$

The covariance of gene frequency with phenotypic value is obtained
from the sum of the products of $q$ and $P$, each multiplied by the
frequency of the genotype. From this sum of products must be
deducted the product of the means of the gene frequency and the
phenotypic value. Thus the covariance is
$\mathrm{cov}_{qP} = pqd - q^2a - qM$, where $M$ is the population mean.
Substituting the value of $M$ from equation 7.2, the covariance reduces
to $-pq[a + d(q - p)] = -pq\alpha$, where $\alpha$ is the average effect
of the gene substitution (see equation 7.5). The regression of gene
frequency on phenotypic value is therefore

$$b_{qP} = \frac{\mathrm{cov}_{qP}}{\sigma_P^2} = -\frac{pq\alpha}{\sigma_P^2} \tag{11.7}$$

where $\sigma_P^2$ is the phenotypic variance.
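The reduction of the covariance to the product of pq and the average
effect can be checked numerically. The sketch below, in Python, computes
the covariance directly from the three rows of Table 11.2 (genotypic
values taken as +a, d and -a in the notation of Fig. 7.1) and compares it
with the shortened form. The numerical values of q, a, d and the
phenotypic variance are arbitrary, and the function name is introduced
only for illustration.

```python
def regression_of_gene_frequency(q, a, d, var_p):
    """Return (covariance from the table, -p*q*alpha, regression b_qP)."""
    p = 1.0 - q
    # Rows: (genotype frequency, frequency of A2 within genotype, value G).
    table = [
        (p * p, 0.0, a),       # A1A1
        (2 * p * q, 0.5, d),   # A1A2
        (q * q, 1.0, -a),      # A2A2
    ]
    mean_q = sum(f * x for f, x, _ in table)      # equals q
    mean_g = sum(f * g for f, _, g in table)      # population mean M
    # Sum of products minus the product of the means.
    cov = sum(f * x * g for f, x, g in table) - mean_q * mean_g
    alpha = a + d * (q - p)                       # average effect
    return cov, -p * q * alpha, cov / var_p


# Arbitrary illustrative values (assumptions, not from any experiment).
cov, shortcut, b_qP = regression_of_gene_frequency(q=0.3, a=1.0, d=0.5,
                                                   var_p=4.0)
print(f"cov from table = {cov:.4f}")
print(f"-p q alpha     = {shortcut:.4f}")
print(f"b_qP           = {b_qP:.4f}")
```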

Next, we substitute this regression coefficient in equation 11.6,
putting also $S = i\sigma_P$ from equation 11.5. This gives the gene
frequency among the selected parents as

$$q_1 = q - ipq\frac{\alpha}{\sigma_P}$$

and the change of gene frequency resulting from the selection
reduces to

$$\Delta q = -ipq\frac{\alpha}{\sigma_P} \tag{11.8}$$
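How closely the regression formula for the change of gene frequency
follows the change actually produced by truncation can be examined with a
small calculation. The sketch below is an illustration under stated
assumptions, not part of the original derivation: the residual variation
within each genotype is taken to be normal, the intensity of selection is
taken as z/p for a normal distribution although the whole population is
strictly a mixture of three normal distributions, and all numerical
values and function names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def delta_q_exact(q, a, d, sd_e, proportion):
    """Change in frequency of A2 when the top `proportion` is selected,
    computed from the normal tails of the three genotypes."""
    p = 1.0 - q
    genotypes = [(p * p, 0.0, a), (2 * p * q, 0.5, d), (q * q, 1.0, -a)]

    def tail(t, g):
        # Fraction of a genotype with mean g lying above the point t.
        return 1.0 - STD_NORMAL.cdf((t - g) / sd_e)

    def selected_fraction(t):
        return sum(f * tail(t, g) for f, _, g in genotypes)

    # Bisection for the point of truncation (selected fraction falls as t rises).
    lo = -max(abs(a), abs(d)) - 10 * sd_e
    hi = max(abs(a), abs(d)) + 10 * sd_e
    for _ in range(100):
        mid = (lo + hi) / 2
        if selected_fraction(mid) > proportion:
            lo = mid
        else:
            hi = mid
    t = (lo + hi) / 2
    q1 = sum(f * x * tail(t, g) for f, x, g in genotypes) / proportion
    return q1 - q


def delta_q_approx(q, a, d, sd_e, proportion):
    """Change of gene frequency from the regression formula, -i p q alpha / sd_P."""
    p = 1.0 - q
    alpha = a + d * (q - p)
    mean_g = a * (p - q) + 2 * d * p * q
    var_g = p * p * a * a + 2 * p * q * d * d + q * q * a * a - mean_g ** 2
    sd_p = (sd_e ** 2 + var_g) ** 0.5
    # Intensity of selection for a normal distribution: ordinate / proportion.
    z = STD_NORMAL.pdf(STD_NORMAL.inv_cdf(1.0 - proportion))
    i = z / proportion
    return -i * p * q * alpha / sd_p


# Hypothetical locus: complete dominance (d = a), small effect, 20% selected.
dq_exact = delta_q_exact(q=0.5, a=0.1, d=0.1, sd_e=1.0, proportion=0.2)
dq_approx = delta_q_approx(q=0.5, a=0.1, d=0.1, sd_e=1.0, proportion=0.2)
print(f"exact  change of q = {dq_exact:.4f}")
print(f"approx change of q = {dq_approx:.4f}")
```

With a locus of small effect the two values should agree closely; the
agreement is expected to worsen as the effect of the locus becomes large
relative to the phenotypic standard deviation.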

The change is negative because selection is acting against the allele
$A_2$ whose frequency is $q$. This formula enables us to translate the
intensity of selection, $i$, into the coefficient of selection, $s$,
against $A_2$, because equations for the change of gene frequency in terms
of $s$ were given in Chapter 2. We shall take the approximate equations
given in 2.7 and 2.8. If dominance is complete, $d = a$ and
$\alpha = 2qa$. Then equating 11.8 with 2.8 gives

$$ipq\frac{2qa}{\sigma_P} = sq^2(1 - q).$$

If there is no dominance, $d = 0$ and $\alpha = a$. Then equating 11.8
with 2.7 gives

$$ipq\frac{a}{\sigma_P} = \tfrac{1}{2}sq(1 - q).$$

Both these equations, on simplification, reduce to

$$s = i\frac{2a}{\sigma_P} \tag{11.9}$$

Thus we find that the two ways of expressing the "force" of selection,
by the intensity and by the coefficient of selection, are very simply
related to each other. The coefficient of selection operating on any
locus is directly proportional to the intensity of selection and to the
quantity $2a/\sigma_P$. This quantity is the difference of value between
the two homozygotes expressed in terms of the phenotypic standard
deviation. For want of a more suitable term we shall refer to this,
rather loosely, as the "proportionate effect" of the locus. There is
nothing more that we can do with the relationship expressed in equation
11.9 at the moment, but we shall use it in the next chapter to draw some
tentative conclusions about the "proportionate effects" of loci
concerned with metric characters.
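The translation of the intensity of selection into a coefficient of
selection can be checked numerically for the two cases considered. In the
Python sketch below the approximate changes of gene frequency in terms of
s are written in the forms that appear on the right-hand sides of the two
equations above; the numerical values and the function names are
hypothetical.

```python
def selection_coefficient(i, a, sd_p):
    """Coefficient of selection: intensity times the proportionate effect."""
    return i * 2 * a / sd_p


def delta_q_from_intensity(i, q, a, d, sd_p):
    """Change of gene frequency, -i p q alpha / sd_P."""
    p = 1.0 - q
    alpha = a + d * (q - p)
    return -i * p * q * alpha / sd_p


# Hypothetical values: proportionate effect 2a/sd_P = 0.2, intensity 1.4.
i, a, sd_p, q = 1.4, 0.1, 1.0, 0.3
s = selection_coefficient(i, a, sd_p)

# Complete dominance (d = a): compare with -s q^2 (1 - q).
dq_dominant = delta_q_from_intensity(i, q, a, d=a, sd_p=sd_p)
dq_dominant_from_s = -s * q * q * (1 - q)

# No dominance (d = 0): compare with -(1/2) s q (1 - q).
dq_additive = delta_q_from_intensity(i, q, a, d=0.0, sd_p=sd_p)
dq_additive_from_s = -0.5 * s * q * (1 - q)

print(f"s = {s:.3f}")
print(f"complete dominance: {dq_dominant:.5f} vs {dq_dominant_from_s:.5f}")
print(f"no dominance:       {dq_additive:.5f} vs {dq_additive_from_s:.5f}")
```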


CHAPTER    12 


SELECTION: 

II.  The  Results  of  Experiments 

In the last chapter we saw that the theoretical deductions about the
effects of artificial selection are limited to the change of the
population mean, and strictly speaking over only one generation. By
changing the gene frequencies selection changes the genetic properties of
the population upon which the effects of further selection depend.
And, because the effects of the individual loci are unknown, the
changes of gene frequency cannot be predicted, and so the response
to selection can be predicted only for as long as the genetic properties
remain substantially unchanged. Thus there are many consequences
of selection that can be discovered only by experiment. The object of
this chapter is to describe briefly what seem to be the most general
conclusions about these consequences that have emerged from
experimental studies of selection. It should be noted, however, that
the drawing of conclusions from the results of experiments in the
field of quantitative genetics is to some extent a matter of personal
judgement. Many of the conclusions put forward in this chapter
therefore represent a personal viewpoint, and are not necessarily
accepted generally. The most important questions to be answered
by experiment concern the long-term effects of selection. For how
long does the response continue? By how much can the population
mean ultimately be changed? What is the genetic nature of the limit
to further progress? These questions will be dealt with in the latter
part of the chapter. First we shall consider two questions raised by
the examples in the last chapter.


Repeatability  of  Response 

In Example 11.1 we saw that the response in one generation of
selection was very variable when the selection was replicated in a
number of lines. Though the average response agreed fairly well
with the prediction, the responses of the individual lines did not.
This raises the question: How consistent, or repeatable, are the
results of selection? If selection is applied to different samples drawn
from the same population, how closely will the results agree? Part
of the problem here concerns sampling variation: the extent to
which the samples differ in gene frequencies, both initially and
during the course of the continued selection. This depends, of course,
on the size of the populations, or lines, during the course of the
selection; but it depends also on the initial gene frequencies in the
base population from which the samples were drawn. If most of the
loci concerned with the character have genes at more or less
intermediate frequencies then the response to selection is not likely to be
much influenced by sampling variation. On the other hand, if there
are loci with genes at low frequency then these will be included in
some samples drawn from the initial population but will be absent
from others. Then, if any of these low-frequency genes have a fairly
large effect on the character their presence or absence may appreciably
influence the outcome of selection. The experiment on abdominal
bristle-number in Drosophila, whose first generation was quoted in
Example 11.1, provides the only evidence on this point (Clayton,
Morris, and Robertson, 1957). Fig. 12.1 shows the responses in the
five up and the five down lines over 20 generations. The responses
are reasonably consistent over the first 5 generations in the up lines
and over about 10 generations in the down lines. Thereafter the
lines begin to differentiate, and by the twentieth generation there are
substantial differences between them. The conclusion suggested by
the early similarity and the later divergence between the replicate
lines is that the early response is governed chiefly by genes at more or
less intermediate frequencies, but in the later stages genes at initially
low frequencies begin to come into play, the initial sampling having
caused differences between the lines in respect of these genes.
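The part played by the initial gene frequency in this sampling can be
given a rough numerical form. The sketch below assumes, purely for
illustration, that a line is founded by N individuals drawn at random from
the base population, so that 2N genes are sampled at each locus; the
chance that an allele of frequency q is missing altogether from the sample
is then (1 - q) raised to the power 2N. The sample size and frequencies
used are hypothetical and are not those of the experiment cited.

```python
def prob_allele_absent(q, n_individuals):
    """Chance that an allele at frequency q is absent from a random
    sample of n_individuals diploid individuals (2n genes)."""
    return (1.0 - q) ** (2 * n_individuals)


n_founders = 10  # hypothetical number of individuals founding a line
for q in (0.5, 0.2, 0.05, 0.02):
    print(f"q = {q:<5} P(absent from the line) = "
          f"{prob_allele_absent(q, n_founders):.4f}")
```

On these assumptions an allele at intermediate frequency is almost certain
to be present in every line, whereas one at a frequency of a few per cent
will be absent from an appreciable fraction of the lines.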

The question of repeatability of the response to selection may be
extended to differences between populations. This is not a matter of
sampling variation but of the differences in the genetic properties of
populations. We noted in Chapter 10 that heritabilities frequently
differ between populations, and consequently we should not expect
the responses to selection to be the same. It is of interest nevertheless
to compare the results of selection applied to different populations
and to see how they do actually differ. Fig. 12.2 shows the results of
selection for thorax length in Drosophila melanogaster applied to three
different wild populations (F. W. Robertson, 1955). The responses
of the three populations, both upward and downward, are fairly alike.
It is not possible to discuss further the degree of repeatability
between the responses found in these two experiments, because there
is no objective criterion for deciding how closely the responses ought
to agree. One can therefore only regard them as empirical evidence
of what in practice does occur.

Fig. 12.1. Selection for abdominal bristle number in Drosophila
melanogaster, replicated in 5 lines in each direction. The broken
lines refer to suspended selection and the thin continuous lines to
inbreeding without selection. (From Clayton, Morris, and Robertson,
1957; reproduced by courtesy of the authors and the editor of
the Journal of Genetics.)

Fig. 12.2. Selection for thorax length in Drosophila melanogaster
from three different base populations. The broken lines refer to
reversed selection and the dotted lines to suspended selection.
(From F. W. Robertson, 1955; reproduced by courtesy of the
author and the editor of the Cold Spring Harbor Symposia on
Quantitative Biology.)


Asymmetry  of  Response 

A  surprising  feature  of  the  experimental  results  illustrated  in  the 
last  chapter  is  the  inequality  of  the  responses  to  selection  in  opposite 
directions,  seen  particularly  well  in  Fig.  11.5.  This  asymmetry  of 
response  has  been  found  in  many  two-way  selection  experiments, 
but  its  cause  is  not  yet  known.  For  this  reason  we  shall  not  discuss 
the  phenomenon  in  detail,  but  shall  merely  note  the  possible  causes, 
of  which  there  are  several.  These  possible  causes  are,  briefly,  as 
follows. 

1. Selection differential. The selection differential may differ
between the upward and downward selected lines, for several reasons.
(i) Natural selection may aid artificial selection in one direction or
hinder it in the other. (ii) The fertility may change so that a higher
intensity of selection is achieved in one direction than in the other.
(iii) The variance may change as a result of the change of mean: the
selection differential will increase as the variance increases and
decrease as it decreases. This is a "scale-effect," to be discussed more
fully in Chapter 17. These three causes operating through the
selection differential were all found in the experiment with mice cited in
the last chapter, but they operated in the direction opposite to that of
the asymmetry found. The selection differential was greater in the
upward selection but the response was greater in the downward
selection. Differences of the selection differential influence the
response per generation, but they affect the realised heritability only a
little. Therefore if the response is plotted against the cumulated
selection differential and there is still much asymmetry, as in Fig.
11.5, it cannot be attributed to any cause operating through the
selection differential.

2. "Genetic asymmetry." There are two sorts of asymmetry
in the genetic properties of the initial population that could give rise
to asymmetry of the responses to selection (Falconer, 1954a). These
concern the dominance and the gene frequencies of the loci concerned
with the character. The dominant alleles at each locus may be mostly
those that affect the character in one direction, instead of being more
or less equally distributed between those that increase and those that
decrease it. We shall refer to this situation as directional dominance.
If the initial gene frequencies were about 0.5, the response would be
expected to be greater in the direction in which the alleles tend to be
recessive. It will be shown in Chapter 14 that this is also the direction
in which the mean is expected to change on inbreeding. Therefore
we should, in general, expect characters that show inbreeding
depression to respond more rapidly to downward selection than to
upward selection. There may also be asymmetry in the distribution of
gene frequencies. The more frequent alleles at each locus may be
mostly those that affect the character in one direction, a situation
that we shall refer to as directional gene frequencies. In the absence of
directional dominance this would be expected to cause a more rapid
response to selection in the direction of the less frequent alleles.
Under natural selection the less favourable alleles, in respect to
fitness, will have been brought to lower frequencies. Therefore if
selection in one direction reduces fitness more than selection in the
other, we should expect a more rapid response in the direction of the
greater loss of fitness. The asymmetry of the response to selection
theoretically expected from these two causes may be seen by
consideration of Fig. 2.3, which shows the expected response arising
from one locus. Neither of these two causes (directional dominance
and directional gene frequencies) would, however, be expected to
give rise to immediate asymmetry; that is, in the first few generations
of selection. The asymmetry would appear only as the gene
frequencies in the upward and downward selected lines become
differentiated. The asymmetry found in some experiments undoubtedly
appears sooner than would be expected from these causes.

3. Selection for heterozygotes. If selection in one direction
favours heterozygotes at many loci, or at a few loci with important
effects, the response would become slow as the gene frequencies
approach their equilibrium values. But the response in the other
direction would be rapid until the favoured alleles approach fixation. This
situation, which is a form of directional dominance, would also be
expected to give rise to an asymmetrical response (Lerner, 1954); but,
again, not immediately.

4. Inbreeding depression. Most experiments on selection are
made with populations not very large in size, and there is usually
therefore an appreciable amount of inbreeding during the progress
of the selection. If the character selected is one subject to inbreeding
depression, there will be a tendency for the mean to decline through
inbreeding. This will reduce the rate of response in the upward
direction and increase it in the downward direction, thus giving rise
to asymmetry. An unselected control population will reveal how
much asymmetry can be attributed to this cause. Inbreeding
depression has been shown to be an insufficient cause of the asymmetry
in the experiments cited in the last chapter.

5. Maternal effects. Characters complicated by a maternal effect
may show an asymmetry of response associated with the maternal
component of the character. The situation envisaged may best
be explained by reference to the selection for body weight in
mice (Falconer, 1955), which showed the strong asymmetry
illustrated in Fig. 11.5. The character selected (6-week weight) may be
divided into two components, weaning weight and post-weaning
growth, the former being maternally determined. It was found that
all the asymmetry resided in the weaning weight and none in the
post-weaning growth. The weaning weight increased hardly at all
in the large line but decreased very much in the small line. Thus it
was the mothering ability that changed asymmetrically under
selection and not the growth of the young themselves. To attribute an
asymmetrical response to maternal effects does not, however, solve
the problem, because the asymmetry has merely been shifted from
the character selected to another, and is still just as much in need of
an explanation.
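The effect of directional dominance described under the second of these
causes may be illustrated by following one locus through successive
generations of selection in each direction. The Python sketch below is a
deliberately simplified model and not a description of any experiment: a
single locus with the increasing allele completely dominant (d = a), an
initial gene frequency of 0.5, a constant intensity of selection, and a
phenotypic standard deviation assumed to remain constant. All names and
numerical values are hypothetical.

```python
def population_mean(q, a, d):
    """Mean contributed by the locus, with q the frequency of A2."""
    p = 1.0 - q
    return a * (p - q) + 2 * d * p * q


def select(q0, a, d, i, sd_p, generations, upward):
    """Generation means under repeated selection, the change of gene
    frequency in each generation being i p q alpha / sd_P in magnitude."""
    q = q0
    means = [population_mean(q, a, d)]
    for _ in range(generations):
        p = 1.0 - q
        alpha = a + d * (q - p)            # average effect, recomputed
        dq = i * p * q * alpha / sd_p
        # Upward selection acts against A2; downward selection favours it.
        q = q - dq if upward else q + dq
        q = min(max(q, 0.0), 1.0)
        means.append(population_mean(q, a, d))
    return means


a = d = 0.1          # complete dominance of the increasing allele (assumed)
up = select(0.5, a, d, i=1.0, sd_p=1.0, generations=20, upward=True)
down = select(0.5, a, d, i=1.0, sd_p=1.0, generations=20, upward=False)

up_first, down_first = up[1] - up[0], down[0] - down[1]
up_total, down_total = up[-1] - up[0], down[0] - down[-1]

print(f"first generation: up {up_first:.4f}, down {down_first:.4f}")
print(f"20 generations:   up {up_total:.4f}, down {down_total:.4f}")
```

In this model the responses in the first generation are nearly equal, and
the greater response in the direction of the recessive allele develops
only as the gene frequencies in the two lines move apart.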

These,  then,  are  the  possible  causes  of  asymmetry  that  may  be 
suggested.  There  are  probably  others.  Until  the  causes  of  asym- 
metry are  better  understood  it  is  clear  that  predictions  of  the  rate  of 
response  to  selection  must  be  made  with  caution.  Where  there  is 
asymmetry  of  response  the  mean  of  the  realised  heritabilities  in  the 
two  directions  will  presumably  correspond  with  the  heritability 
estimated  from  the  resemblance  between  relatives.  Therefore  the 
response  predicted  will  presumably  be  about  the  mean  of  the  two- 
way  responses  actually  obtained.  If  the  asymmetry  found  in  the 
mouse  experiment  should  prove  to  be  characteristic  of  selection  for 
economically  desirable  characters  in  mammals,  it  means  that  we  must 
expect  actual  progress  to  fall  short  of  the  predicted  progress.   In  this 
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experiment  the  mean  realised  heritability  was  35  per  cent,  but  the 
upward  progress  was  only  at  the  rate  of  18  per  cent.  In  other  words 
the  progress  made  was  only  about  half  as  rapid  as  would,  presumably, 
have  been  predicted. 
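
The arithmetic behind this comparison can be set out in a few lines. The sketch below assumes the usual prediction that response equals heritability times selection differential; the selection differential `S` is an arbitrary, hypothetical value, and the downward figure is only what the quoted percentages would imply if the 35 per cent is a simple average of the two directions.

```python
def predicted_response(heritability, selection_differential):
    """Expected response R = h^2 * S (the standard prediction equation)."""
    return heritability * selection_differential

mean_realised_h2 = 0.35     # mean of the two directions (from the text)
upward_realised_h2 = 0.18   # upward direction only (from the text)
S = 1.0                     # hypothetical selection differential, arbitrary units

predicted = predicted_response(mean_realised_h2, S)
achieved_upward = predicted_response(upward_realised_h2, S)
shortfall_ratio = achieved_upward / predicted

# Assumption: the 35 per cent is a simple average of upward and downward
# realised heritabilities, so the downward value follows by subtraction.
implied_downward_h2 = 2 * mean_realised_h2 - upward_realised_h2

print(f"upward progress / predicted progress = {shortfall_ratio:.2f}")
print(f"implied downward realised heritability = {implied_downward_h2:.2f}")
```

The ratio comes out at roughly one half whatever value of `S` is chosen, since `S` cancels.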


Long-term  Results  of  Selection 

The  response  to  selection  cannot  be  expected  to  continue  in- 
definitely. Sooner  or  later  it  is  to  be  expected  that  all  the  favourable 
alleles  originally  segregating  will  be  brought  to  fixation.  As  they 
approach  fixation  the  genetic  variance  should  decline  and  the  rate  of 
response  diminish,  till,  when  fixation  is  complete,  the  response  should 
cease.  The  population  should  then  fail  also  to  respond  to  selection  in 
the  opposite  direction,  and  further  response  to  selection  in  either 
direction  will  depend  on  the  origin  of  new  genetic  variation  by 
mutation.  But  how  many  generations  must  elapse  before  the  response 
ceases,  and  how  great  will  be  the  total  response  are  questions  that  can 
be  answered  only  by  experiment.  Let  us  first  see  what  evidence  is 
available  on  these  points,  and  then  see  how  far  the  long-term  effects 
of  selection  conform  to  the  simple  theoretical  picture  outlined  above. 
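
The simple theoretical picture can be illustrated numerically. The following is only a sketch under strong assumptions that are not taken from the experiments: `n_loci` unlinked additive loci of equal effect `a`, all starting at a frequency of 0.5, no inbreeding, a constant environmental variance `v_e`, and a response in each generation equal to the selection intensity times the additive variance divided by the phenotypic standard deviation. All parameter values are hypothetical.

```python
import math

def simulate_selection(n_loci=100, a=0.1, v_e=0.5, intensity=1.0, generations=60):
    """Deterministic sketch of directional selection towards fixation.

    Returns the population mean (as a deviation from the starting mean)
    and the response obtained in each generation.
    """
    q = 0.5                       # frequency of the favourable allele at every locus
    means, responses = [], []
    for _ in range(generations):
        v_a = 2 * q * (1 - q) * a * a * n_loci   # additive variance of n equal loci
        sd_p = math.sqrt(v_a + v_e)              # phenotypic standard deviation
        response = intensity * v_a / sd_p        # R = i * h^2 * sd_P
        q += response / (2 * n_loci * a)         # the mean is 2*n*a*q plus a constant
        responses.append(response)
        means.append(2 * n_loci * a * (q - 0.5))
    return means, responses

means, responses = simulate_selection()
upper_limit = 100 * 0.1           # n * a: the mean when every locus is fixed
for generation in (1, 10, 20, 30, 60):
    print(generation, round(means[generation - 1], 2), round(responses[generation - 1], 3))
```

In this model the response falls steadily as the additive variance is used up, and the mean approaches but never passes the limit set by fixation of all the favourable alleles.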

Total  response  and  duration  of  response.  When  the  response 
to  selection  has  ceased,  the  population  is  said  to  be  at  the  selection 
limit.  It  is  usually  impossible  to  decide  exactly  at  what  point  the 
limit  is  reached,  because  the  limit  is  approached  gradually,  the  res- 
ponse becoming  progressively  slower.  The  total  response,  and  par- 
ticularly the  duration  of  the  response,  can  therefore  be  estimated  only 
approximately.  Bearing  this  in  mind,  we  may  examine  the  results  of 
four  two-way  selection  experiments,  two  with  Drosophila  and  two 
with  mice,  given  in  Table  12.1.  The  asymmetry  of  the  responses  is 
disregarded,  and  the  total  response  is  taken  as  the  sum  of  the  total 
responses  in  the  two  directions.  This  is  the  difference  between  the 
upper and lower selection limits, and may be called the total range. In
the table the total range, $R$, is expressed in three ways: as a percentage
of the initial population mean, $M_0$; in terms of the phenotypic standard
deviation, $\sigma_P$, in the initial population; and in terms of the standard
deviation of breeding values, $\sigma_A$ (i.e. the square root of the additive
variance) in the initial population. To draw general conclusions from
these  four  experiments  would  be  rash,  because  the  experiments 
differed in several ways (in the intensity of selection, the population
size, and the nature of the initial population), all of which would be
expected to affect the duration of response and the total range.
Despite these differences, however, the picture they give is fairly
consistent. The response continues for about 20 to 30 generations;

Table 12.1
Total Responses in four Selection Experiments

                                Duration               Total range
Experiment                    (generations)   $R/M_0$ (%)   $R/\sigma_P$   $R/\sigma_A$

Drosophila:
  (1) abdominal bristles           30             189           20             28
  (2) thorax length                20              24           12             22

Mice:
  (3) 6-week weight                25              69            8             16
  (4) 60-day weight                20             122           10             21

References:
  (1) Clayton and Robertson (1957).
  (2) F. W. Robertson (1955).
  (3) Falconer (1955).
  (4) MacArthur (1949); Butler (1952).

and  the  total  range  is  between  15  and  30  times  the  square  root  of  the 
additive  variance,  or  about  10  to  20  times  the  phenotypic  standard 
deviation  in  the  initial  population.  The  relationship  between  the 
total  range  and  the  original  population  mean,  however,  is  quite 
irregular. 

The  total  response  produced  by  selection  in  these  experiments, 
though  it  may  be  impressive  when  reckoned  in  terms  of  the  variation 
present  in  the  original  population,  is  not  at  all  spectacular  when  com- 
pared with  the  achievements  of  the  breeders  of  domestic  animals. 
For  example,  the  upper  limits  of  body  weight  of  the  mice  in  the 
experiments  quoted  are  2  to  3  times  the  lower  limits;  but  the  weights 
of  the  largest  breeds  of  dog  are  about  75  times  greater  than  those  of 
the  smallest  (Sierts-Roth,  1953).  The  reason  for  the  disappointing 
results  of  experimental  selection  when  viewed  against  the  differences 
between  the  breeds  of  domestic  animals  is  that  experiments  are 
carried  out  with  closed  populations  of  not  very  large  size.  The  limits 
are  set  by  the  gene  content  of  the  foundation  individuals,  since  no 
genes are brought in after selection has been started. The breeder of
domestic animals, in contrast, by intermittent crossing casts his net
far  wider  in  the  search  for  genes  favourable  to  his  purposes. 

The  effects  of  inbreeding  during  the  selection  have  been  ignored 
in  this  account  of  selection  limits.  It  is  clear  on  theoretical  grounds 
that  inbreeding  will  tend  to  cause  fixation  of  unfavourable  alleles  at 
some  loci.  Both  the  total  response  and  the  duration  of  the  response 
must  therefore  be  expected  to  be  reduced  if  the  selection  is  carried 
out  in  a  small  population  with  a  fairly  high  rate  of  inbreeding.  There 
is,  however,  little  experimental  evidence  on  the  magnitude  of  this 
effect  of  inbreeding.  The  four  experiments  discussed  above  were  all 
carried  out  on  fairly  large  populations,  so  that  the  rate  of  inbreeding 
was  fairly  low. 

Number  of  "loci."  When  the  total  range  has  been  determined 
by  experiment  it  is  possible,  in  principle,  to  deduce  the  number  of 
loci  that  gave  rise  to  the  response,  and  the  magnitude  of  their  effects. 
The  estimates  that  can  be  made  in  practice,  however,  are  only  rough 
ones,  because  the  properties  of  the  individual  loci  are  unknown  and 
have  to  be  guessed  at.  But  even  though  we  can  do  no  more  than 
establish  the  order  of  magnitude  of  the  number  and  effects  of  the  loci, 
this  is  better  than  no  estimate  at  all;  so  let  us  see  how  these  estimates 
may  be  obtained.  The  limitations  will  become  apparent  as  we  pro- 
ceed. 

The  estimates  come  from  a  comparison  of  the  total  range  with 
the  amount  of  additive  genetic  variance  in  the  original  population.  In 
principle  it  is  clear  that  with  a  given  amount  of  initial  variation  a 
small  number  of  genes  will  produce  less  total  response  than  a  larger 
number;  and  that  if  a  given  amount  of  variation  is  produced  by  few 
genes  the  magnitude  of  their  effects  must  be  greater  than  if  it  is  pro- 
duced by  many.  It  is  clear,  also,  that  linkage  is  an  important  factor 
in  the  relationship  between  variance  and  total  response.  Some  seg- 
ments of  chromosome  that  segregate  as  units  in  the  initial  popula- 
tion will  recombine  during  the  selection  and  appear  as  many  genes 
contributing  to  the  total  response.  Other  segments  may  fail  to  re- 
combine  and  will  be  counted  as  single  genes.  In  order  to  emphasise 
this  limitation,  the  estimate  of  the  number  of  loci  may  be  referred  to 
as  the  number  of  "effective  factors"  or  as  the  "segregation  index." 
There  are,  however,  other  uncertainties,  and  we  shall  simply  refer  to 
it  as  the  number  of  "loci,"  letting  the  inverted  commas  serve  to 
remind  us  of  the  unavoidable  limitations  and  qualifications. 

We must first suppose that there has been no inbreeding and when
the selection limits have been reached all loci are fixed for the
favourable allele. The total range is then $2\sum a$, where $2a$ is the
difference of genotypic value between the two most extreme homozygotes at a
particular locus, and is the precise meaning of what we have loosely
called the "effect" of the locus. If $R$ is the total range and $n$ is the
number of loci that have contributed to the response, then

    $$R = 2n\bar{a}$$    (12.1)

where $\bar{a}$ is the mean value of $a$. Next we must suppose that each
locus has only two alleles. The additive variance arising from one locus is
then $\sigma_A^2 = 2pq[a + d(q - p)]^2$, from equation 8.5. (We shall use
$\sigma^2$ here to denote variance instead of $V$, because it simplifies the
formulation when standard deviations are involved.) The gene frequencies at
the individual loci thus enter the picture. Unless the initial population
was made from crosses between inbred lines, the gene frequencies
are not known and we shall therefore have to insert hypothetical
values. We shall suppose that all segregating genes are at frequencies
of 0.5, as they would be if the initial population were made from a
cross between two inbred lines. The additive variance contributed by
one locus then becomes $\sigma_A^2 = \tfrac{1}{2}a^2$, and the degree of
dominance becomes irrelevant. Next we have to suppose there is no linkage
between the loci, so that the additive variance due to all $n$ loci together is

    $$\sigma_A^2 = \tfrac{1}{2}n\,\overline{(a^2)}$$    (12.2)

where $\overline{(a^2)}$ is the mean of the squares of $a$ for each locus.
Finally we shall suppose that all loci have equal effects, so that equations
12.1 and 12.2 become

    $$R = 2na$$    (12.3)

and

    $$\sigma_A^2 = \tfrac{1}{2}na^2$$    (12.4)

Squaring equation 12.3 and substituting $a^2 = (2/n)\sigma_A^2$ from equation
12.4 gives $R^2 = 8n\sigma_A^2$, whence

    $$n = \frac{R^2}{8\sigma_A^2}$$    (12.5)

This equation gives the basis for estimating the number of "loci."
Their effects may then be estimated from equation 12.4. The most
meaningful measure of the "effect" of a locus, however, is what we
have earlier called the "proportionate effect," $2a/\sigma_P$, which is the
difference between the homozygotes expressed in terms of the phenotypic
standard deviation. By rearrangement of equation 12.4 this
becomes

    $$\frac{2a}{\sigma_P} = h\sqrt{\frac{8}{n}}$$    (12.6)


where  h  is  the  square  root  of  the  heritability. 

Let  us  see  what  results  these  theoretical  deductions  yield  when 
applied  to  the  experiments  quoted  in  Table  12.1.  The  estimates  of 
the  number  of  "loci"  and  of  the  proportionate  effects  of  the  genes  are 


Table 12.2

                               Number of      Proportionate
Experiment                      "loci"        effect ($2a/\sigma_P$)

Drosophila:
  (1) abdominal bristles          99              0.21
  (2) thorax length               59              0.20

Mice:
  (3) 6-week weight               35              0.23
  (4) 60-day weight               53              0.19

(For references to experiments see Table 12.1)

given  in  Table  12.2.  Since  the  estimation  of  the  number  of  "loci"  is 
necessarily  so  imprecise  it  does  not  seem  worth  while  to  discuss  in 
detail  its  limitations  or  the  errors  that  may  have  been  introduced  by 
the  assumptions  that  were  made.  These  matters  are  discussed  by 
Wright (1952b). The results given in Table 12.2, then, suggest that
the  responses  to  selection  in  these  experiments  have  resulted  from 
about  100  loci  (i.e.  more  nearly  100  than  10  or  1,000);  and  that  on  the 
average  the  difference  in  value  between  homozygotes  at  one  locus 
amounts  to  about  one-fifth  of  the  phenotypic  standard  deviation. 
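
The calculation can be repeated from the ratios printed in Table 12.1. The sketch below assumes the relations $n = R^2/(8\sigma_A^2)$ for the number of "loci" and $2a/\sigma_P = h\sqrt{8/n}$ for the proportionate effect, with $h$ taken as the ratio $\sigma_A/\sigma_P$ implied by the two tabulated columns. The function and variable names are illustrative only.

```python
import math

# Total range in units of sd_P and of sd_A, as printed in Table 12.1.
TABLE_12_1 = {
    "abdominal bristles": (20, 28),
    "thorax length": (12, 22),
    "6-week weight": (8, 16),
    "60-day weight": (10, 21),
}

def number_of_loci(range_over_sd_a):
    """n = R^2 / (8 * var_A), with R already expressed in units of sd_A."""
    return range_over_sd_a ** 2 / 8

def proportionate_effect(range_over_sd_p, range_over_sd_a):
    """2a / sd_P = h * sqrt(8 / n), where h = sd_A / sd_P."""
    n = number_of_loci(range_over_sd_a)
    h = range_over_sd_p / range_over_sd_a
    return h * math.sqrt(8 / n)

for character, (r_p, r_a) in TABLE_12_1.items():
    n = number_of_loci(r_a)
    effect = proportionate_effect(r_p, r_a)
    print(f"{character:20s} n = {n:6.1f}   2a/sd_P = {effect:.2f}")
```

Because the ratios in Table 12.1 are rounded, the values obtained in this way (about 98, 60, 32 and 55 "loci") are close to, but not identical with, those of Table 12.2, which were presumably computed from unrounded figures.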

Nature  of  the  selection  limit.  The  deductions  made  in  the  last 
section  from  the  observed  total  response  were  based  on  the  assump- 
tion that  the  selection  limit  represents  fixation  of  all  favourable 
alleles.  The  simple  theoretical  expectation  is  that  selection  should 
lead  to  fixation  with  the  consequent  loss  of  genetic  variance.  Let  us 
now  consider  the  evidence  from  experiments  about  the  nature  of  the 
selection  limit  and  see  how  far  it  conforms  to  this  simple  theoretical 
picture. If the genetic variance declines as the limit is approached
this ought to be apparent in a decline of phenotypic variance. In
many  experiments,  however,  the  phenotypic  variance  has  been  found 
not  to  decline,  even  when  the  selection  limit  has  been  reached,  and 
when  due  allowance  for  "scale  effects"  has  been  made  as  will  be 
explained  in  Chapter  17.  A  fairly  typical  example  is  provided  by  the 
experiment with mice which was described in the last chapter (Fig.
11.5). The phenotypic variance is shown in Fig. 12.3, expressed in
the form of the coefficient of variation in order to eliminate scale
effects.

Fig. 12.3. Coefficient of variation of 6-week weight in mice. The
thin continuous line starting at generation 23 refers to the
unselected control. The broken lines refer to reversed selection and
the dotted lines to suspended selection. (From Falconer, 1955;
reproduced by courtesy of the editor of the Cold Spring Harbor
Symposia on Quantitative Biology.)

The variance in the large line remains at the same level
throughout  the  experiment,  and  after  the  limit  has  been  reached  at 
about  the  twenty-fifth  generation  a  comparison  with  the  unselected 
control  shows  the  variance  not  to  have  declined  at  all.  The  variance 
in  the  small  line  shows  a  sudden  and  large  increase,  but  we  shall 
return  to  this  point  later.  An  example  from  Drosophila  is  provided 
by the experiment on abdominal bristle-number illustrated in Fig.
12.1. The phenotypic variance in the base population and in the most
extreme  of  the  high  and  of  the  low  lines  after  35  and  34  generations 
respectively  is  illustrated  by  frequency  distributions  in  Fig.  12.4.  In 
this  case  the  variance  not  only  failed  to  decline  but  increased  very 
much during the selection in both directions.

Fig. 12.4. Frequency distributions of abdominal bristle number
in Drosophila melanogaster (females), in the base population and in
the most extreme high and low lines after 35 and 34 generations
of selection. (From Clayton, Morris, and Robertson, 1957;
reproduced by courtesy of the authors and the editor of the Journal
of Genetics.)

Before we consider the
reasons for this behaviour of the variance we shall mention another
fact often found in selection experiments. It is that when the response
to continued selection has ceased the population will often respond
to selection in the reverse direction and will often respond rapidly.
This is well illustrated in Fig. 12.2, where the three lines selected for
increased thorax length returned rapidly to the unselected level
when  the  direction  of  selection  was  reversed  after  the  upward  res- 
ponses had  ceased.  The  lines  selected  for  reduced  thorax  length, 
however,  did  not  respond  to  reversed  selection.  From  this  brief 
outline  of  the  evidence  it  is  clear  that  the  simple  theoretical  picture  of 
the  selection  limit  is  not  substantiated  by  experiment.  Instead,  we 
find — not  always  but  often — no  loss  of  phenotypic  variance  and  the 
ability  to  respond  rapidly  to  reversed  selection.  Let  us  now  consider 
what  may  be  the  possible  reasons  for  these  facts,  and  what  conclusions 
about  the  genetic  nature  of  the  selection  limit  can  be  drawn  from 
them. 

1. The failure of the phenotypic variance to decline may be due
to an increase of non-genetic variance compensating for the expected
reduction of genetic variance. With the approach to fixation of the
loci concerned, and of others linked to them, the frequency of
homozygotes will increase. There is evidence, mentioned in Chapter 8
and  to  be  discussed  more  fully  in  Chapter  15,  that  homozygotes  are 
sometimes  more  variable  from  environmental  causes  than  hetero- 
zygotes.  This  could  cause  an  increase  of  environmental  variance  which 
might  counterbalance  a  reduction  of  genetic  variance;  but  there  is 
little  experimental  evidence  concerning  the  matter. 

2.  If  the  population,  after  the  selection  limit  has  been  reached, 
responds  to  reversed  selection  we  can  only  conclude  that  genetic 
variance  of  some  sort  remains.  The  continued  presence  of  genetic 
variance  could  result  from  the  following  causes: 

(i)  We  saw  in  Example  11.6  how  natural  selection  opposed  the 
artificial  selection  for  small  size  in  mice,  partly  because  small  mice 
are  less  fertile  than  large  ones  and  partly  because  the  smallest  mice 
were  sterile.  Natural  selection  acting  in  this  sort  of  way  may  increase 
as  the  population  mean  changes  further  from  the  original  level,  until 
it  becomes  strong  enough  to  counteract  completely  the  artificial 
selection.  The  response  would  then  cease,  but  reversed  selection 
would  be  aided  by  natural  selection  and  the  population  would  res- 
pond. 

(ii)  Selection  may  favour  heterozygotes  at  some  loci.  At  the 
selection  limit  the  genes  would  be  in  equilibrium  at  more  or  less 
intermediate  frequencies,  and  they  would  give  rise  to  genetic  vari- 
ance. But  the  variance  would  be  non-additive,  and  there  would  be 
no  immediate  response  to  reversed  selection.  If  reversed  selection 
were  continued  a  response  would  slowly  develop  and  become  more 
rapid  as  the  gene  frequencies  changed  away  from  the  equilibrium 
values.  The  behaviour  of  populations  at  the  selection  limit,  however, 
does  not  seem  commonly  to  be  of  this  sort. 

(iii) If there is superiority of heterozygotes arising from the
combined action of artificial and natural selection then the situation is
quite different. Consider a locus at which the heterozygote A1A2 is
superior in the character selected to the homozygote A1A1, and the
homozygote A2A2 is inviable or sterile. Artificial selection will choose
A1A2, or perhaps A2A2 if it is viable, but natural selection will reject
A2A2, so that under the combined effect of artificial and natural
selection the heterozygote is superior. The pygmy gene in mice
which was used for several examples in Chapter 7 provides just such a
case, when artificial selection is in the direction of small size.
Heterozygotes are favoured because they are smaller than normal
homozygotes; homozygous pygmies are smaller still but are sterile. When the
selection  limit  is  reached  under  this  situation  there  will  be  genetic 
variance  due  to  the  gene,  but  no  further  response.  When  selection  is 
reversed,  however,  it  is  only  the  artificial  selection  that  is  reversed  in 
direction,  and  one  homozygote  will  be  favoured.  The  population  will 
therefore  respond  immediately.  This  may  be  regarded  as  an  extreme 
form  of  asymmetrical  response  to  selection.  It  leads  to  the  anomaly  of 
a  high  heritability — about  50  per  cent — estimated  from  the  offspring- 
parent  regression,  but  a  realised  heritability  of  zero  in  one  direction 
and  up  to  100  per  cent  in  the  other  direction.  The  anomaly,  however, 
is  only  apparent  because  the  estimation  of  heritability  and  the  pre- 
diction of  the  response  to  selection  are  valid  only  if  natural  selection 
does  not  interfere  with  the  appearance  of  the  genotypes  in  their  proper 
Mendelian  ratios. 

The  situation  described  above  was  proved  to  exist  in  one  of  the 
lines  of  Drosophila  selected  for  high  bristle  number  in  the  experiment 
illustrated  in  Fig.  12.1.  There  was  a  gene  present  which  was  lethal 
in  the  homozygote  and  which  in  the  heterozygote  increased  bristle 
number by 22, which is 5.8 times the original phenotypic standard
deviation  (Clayton  and  Robertson,  1957).  The  line  carrying  this 
gene  was  the  one  whose  distribution  is  shown  in  Fig.  12.4,  and  the 
bimodality  of  the  distribution  can  be  seen.  It  seems  probable  that 
in  cases  like  this  the  gene  does  not  have  so  large  an  effect  in  the  original 
population,  but  that  the  effect  of  the  heterozygote  is  enhanced  during 
the  selection,  either  by  "modifying"  genes  or  by  a  cross-over  which 
separates  a  linked  gene  whose  effect  is  in  the  opposite  direction.  A 
mechanism  of  this  sort  seems  to  be  required  to  account  for  the  very 
great increase of variance often found in selected lines (F. W. Robertson
and Reeve, 1952a; Clayton and Robertson, 1957).
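
The behaviour of such a gene at the selection limit can be followed with a small calculation. The sketch below is a deliberately extreme simplification, not a description of the actual experiment: it assumes that upward selection is intense enough to pick only heterozygotes as parents, that reversed selection picks only normal homozygotes, that mating among the chosen parents is random, and that the lethal homozygote never survives. The heterozygote effect of 22 bristles is the figure quoted above; the function names are illustrative.

```python
from fractions import Fraction

HET_EFFECT = 22  # bristles added by the heterozygote (figure quoted in the text)

def zygotes_from(parents):
    """Random mating among the selected parents.

    parents = (frequency of A1A1, frequency of A1A2);
    returns zygote frequencies (A1A1, A1A2, A2A2).
    """
    _f11, f12 = parents
    q = f12 / 2            # frequency of the lethal allele A2 among the parents
    p = 1 - q
    return (p * p, 2 * p * q, q * q)

def survivors(zygotes):
    """Natural selection removes the lethal A2A2 class; renormalise the rest."""
    z11, z12, _lethal = zygotes
    alive = z11 + z12
    return (z11 / alive, z12 / alive)

def mean_bristle_gain(population):
    """Mean contribution of the locus, measured from the A1A1 homozygote."""
    return population[1] * HET_EFFECT

only_heterozygotes = (Fraction(0), Fraction(1))
only_normals = (Fraction(1), Fraction(0))

# Continued upward selection: heterozygotes are chosen in every generation.
upward = []
for _ in range(3):
    offspring = survivors(zygotes_from(only_heterozygotes))
    upward.append(mean_bristle_gain(offspring))

# Reversed selection: normal homozygotes are chosen instead.
reversed_mean = mean_bristle_gain(survivors(zygotes_from(only_normals)))

print(upward, reversed_mean)
```

Under these assumptions the offspring mean is the same in every generation of upward selection, because the surviving offspring of heterozygous parents are always one-third normal homozygotes and two-thirds heterozygotes, yet a single generation of reversed selection removes the whole contribution of the locus.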

The  selection  of  heterozygotes  at  one  or  a  few  loci  with  major 
effects  through  the  combined  action  of  artificial  and  natural  selection 
in  the  manner  explained  above  seems  to  be  a  common  situation  in 
Drosophila  populations  at  the  selection  limit.  Whether  it  occurs  as 
frequently  in  other  organisms  is  not  known  because  the  genetic 
analyses  required  to  detect  it  are  more  difficult  to  make.  The  increase 
of  variance  in  the  mice  selected  for  small  size  shown  in  Fig.  12.3  may 
well  have  been  due  to  this  cause. 

The  deleterious  effect  on  fitness  is  an  essential  part  of  the  situa- 
tion, so  genes  of  this  sort  will  always  be  at  low  frequencies  in  the 
initial population. The appearance of any particular gene in a selected
line will therefore depend very much on the chances of sampling, or
on  its  occurring  later  by  mutation.  Consequently  such  genes  will  be  a 
cause  of  differences  between  replicated  lines,  such  as  we  noted  at  the 
beginning  of  this  chapter  in  the  experiment  on  Drosophila  bristle 
number,  and  they  will  render  the  selection  limit  to  a  large  extent 
unpredictable  in  its  level  and  its  precise  genetic  nature. 

Relevance  of  selection  limits  to  animal  and  plant  improve- 
ment. It  may  be  thought  that  experimental  studies  of  long  continued 
selection  are  of  little  relevance  to  the  practice  of  selection  in  animal 
and  plant  improvement,  because  the  breeder  is  concerned  only  with 
the  first  five  or  ten  generations.  This,  however,  is  not  necessarily  so. 
The  breeds  of  animals  and  varieties  of  plants  which  he  seeks  to  im- 
prove have  already  been  under  selection  for  more  or  less  the  same 
characters  over  a  long  time.  They  may  therefore  by  now  be  approach- 
ing, if  they  are  not  already  at,  the  selection  limits.  An  understanding 
of  the  nature  of  the  selection  limit  and  of  the  behaviour  of  populations 
at  the  selection  limit  may  therefore  be  very  relevant  in  the  field  of 
practice. 


CHAPTER    13 

SELECTION: 

III.  Information  from  Relatives 

In  our  consideration  of  selection  we  have  up  to  now  supposed  that 
individuals  are  measured  for  the  character  to  be  selected  and  that  the 
best  are  chosen  to  be  parents  in  accordance  with  the  individual  pheno- 
typic  values.  An  individual's  own  phenotypic  value,  however,  is  not 
the  only  source  of  information  about  its  breeding  value;  additional 
information  is  provided  by  the  phenotypic  values  of  relatives,  particu- 
larly by  those  of  full  or  half  sibs.  With  some  characters,  indeed,  the 
values  of  relatives  provide  the  only  available  information.  Milk- 
yield,  to  take  an  obvious  example,  cannot  be  measured  in  males,  so 
the  breeding  value  of  a  male  can  only  be  judged  from  the  phenotypic 
values  of  its  female  relatives.  Ovarian  response  to  gonadotropic 
hormone,  a  character  for  which  selection  has  been  applied  in  rats 
(Kyle  and  Chapman,  1953),  cannot  be  measured  on  the  living  animal, 
so  selection  can  only  be  based  on  the  phenotypic  values  of  female 
relatives.  The  use  of  information  from  relatives  is  of  great  importance 
in  the  application  of  selection  to  animal  breeding,  for  two  reasons. 
First,  the  characters  to  be  selected  are  often  ones  of  low  heritability, 
and  with  these  the  mean  value  of  a  number  of  relatives  often  provides 
a  more  reliable  guide  to  breeding  value  than  the  individual's  own 
phenotypic  value.  And,  second,  when  the  outcome  of  selection  is  a 
matter  of  economic  gain  even  quite  a  small  improvement  of  the 
response  will  repay  the  extra  effort  of  applying  the  best  technique. 
In  this  chapter  we  shall  outline  the  principles  underlying  the  use  of 
information  from  relatives  and  the  choice  of  the  best  method  of 
selection,  but  we  shall  not  discuss  the  technical  details  of  procedure 
in  the  application  of  selection  to  animal  breeding. 

Methods  of  Selection 

If  the  family  structure  of  the  population  is  taken  into  account  we 
can compute the mean phenotypic value of each family; this is known
as the family mean. Suppose, then, that we have a population in which
the  individuals  are  grouped  in  families,  which  may  be  full  or  half  sibs, 
and  we  have  measurements  of  each  individual  and  of  the  means  of 
every  family.  A  choice  of  procedure  for  applying  selection  to  this 
population  is  then  open,  according  to  the  use  we  make  of  the  family 
means.  Let  us  first  look  at  the  problem  from  the  point  of  view  of  the 
additional  information  provided  by  the  values  of  relatives.  Suppose, 
for  example,  that  we  have  an  individual  whose  own  value  puts  it  on 
the  border-line  between  selection  and  rejection,  and  it  has  a  number 
of  sibs  with  high  values,  so  that  the  family  to  which  it  belongs  has  a 
high  mean.  We  may  interpret  the  situation  in  one  of  two  ways. 
Either  we  may  say  that  the  individual's  own  rather  poor  value  has 
been  due  to  poor  environmental  circumstances,  and  that  the  high 
family  mean  suggests  that  its  breeding  value  is  likely  to  be  a  good  deal 
better  than  its  phenotypic  value.  Or  we  may  say  that  the  high  family 
mean  has  been  due  to  a  favourable  common  environment,  provided 
perhaps  by  a  good  mother,  from  which  the  individual  in  question 
must  also  have  benefited;  on  this  interpretation,  therefore,  the  in- 
dividual's breeding  value  is  likely  to  be  less  good  than  its  phenotypic 
value.  In  the  first  case  we  should  regard  the  information  from  the 
relatives  as  favourable  and  we  should  select  the  individual  in  question, 
while  in  the  second  case  we  should  regard  it  as  unfavourable  and 
should  reject  the  individual.  Here  then  is  the  problem:  how  do  we 
decide which is the correct interpretation? It turns out that only three
things  need  be  known:  the  kind  of  family  (whether  full  or  half  sibs), 
the  number  of  individuals  in  the  families  (i.e.  the  family  size),  and  the 
phenotypic  correlation  between  members  of  the  families  with  respect 
to  the  character.  The  choice  of  method  is  thus  a  relatively  simple 
matter  in  practice.  But  the  explanation  of  the  principles  underlying 
the  choice  is  more  complicated.  Before  embarking  on  this  explana- 
tion we  shall  therefore  give  a  brief  general  account  of  the  different 
methods  of  selection  according  to  the  use  made  of  the  information 
from  relatives,  indicating  the  circumstances  to  which  each  method 
is  specially  suited.  Then  we  shall  explain  how  the  response  expected 
under  each  method  is  deduced;  and  finally  we  shall  compare  the 
relative  merits  of  the  methods  under  different  circumstances. 

The phenotypic value of an individual, P, measured as a deviation
from the population mean, is the sum of two parts: the deviation of
its family mean from the population mean, Pf, and the deviation of the
individual from the family mean, Pw (the within-family deviation);
so that P = Pf + Pw.
The procedure of selection, then, varies according to the attention
paid,  or  the  weight  given,  to  these  two  parts.  If  we  select  on  the  basis 
of  individual  values  only,  as  assumed  in  the  last  two  chapters,  we  give 
equal  weight  to  the  two  components  Pf  and  Pw  of  the  individual's 
value  P.  This  is  known  as  individual  selection.  We  may,  alternatively, 
select  on  the  basis  of  the  family  mean  Pf  alone,  disregarding  the 
within-family  deviation  Pw  entirely.  This  is  known  as  family  selection 
and  it  corresponds  to  the  procedure  adopted  in  the  first  case  discussed 
above.  Again,  we  may  select  on  the  basis  of  the  within-family  devia- 
tion Pw  alone,  disregarding  the  family  mean  Pf  entirely.  This  is 
known  as  within-family  selection  and  it  corresponds  to  the  second  case 
discussed  above.  Finally,  we  may  take  account  of  both  components 
Pf  and  Pw  but  give  them  different  weights  chosen  so  as  to  make  the 
best  use  of  the  two  sources  of  information.  This  is  known  as  selection 
by  optimum  combination,  or  combined  selection.  It  represents  the 
general  solution  for  obtaining  the  maximum  rate  of  response,  and  the 
other  three  simpler  methods  are  special  cases  in  which  the  weights 
given to the two sources of information are either 1 or 0. It is therefore
in principle always the best method. But its advantage over one
or  other  of  the  simpler  methods  is  never  very  great,  and  it  is  a  refine- 
ment that  is  not  often  worth  while  in  practice.  Beyond  showing  why 
this  is  so,  we  shall  therefore  not  give  very  much  attention  to  combined 
selection. 
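
The way the three simpler methods weight the two components can be shown with a small numerical sketch. The measurements below are hypothetical, as are the function names; each method is represented by a pair of weights (on Pf, on Pw), and the individuals with the highest weighted criterion are taken as selected.

```python
from statistics import mean

# Hypothetical measurements for three families of four sibs each.
families = {
    "A": [12.0, 10.0, 11.0, 9.0],
    "B": [8.0, 9.0, 7.0, 8.0],
    "C": [10.0, 13.0, 9.0, 14.0],
}

def components(families):
    """Split each value into P (deviation from the population mean),
    Pf (deviation of the family mean) and Pw (within-family deviation)."""
    population_mean = mean(v for sibs in families.values() for v in sibs)
    rows = []
    for name, sibs in families.items():
        family_mean = mean(sibs)
        for index, value in enumerate(sibs):
            rows.append({
                "id": (name, index),
                "P": value - population_mean,
                "Pf": family_mean - population_mean,
                "Pw": value - family_mean,
            })
    return rows

# Weights (on Pf, on Pw) that define the three simple methods.
METHODS = {
    "individual": (1, 1),
    "family": (1, 0),
    "within-family": (0, 1),
}

def select(rows, method, number):
    """Return the ids of the `number` individuals with the highest criterion."""
    w_f, w_w = METHODS[method]
    ranked = sorted(rows, key=lambda r: w_f * r["Pf"] + w_w * r["Pw"], reverse=True)
    return [r["id"] for r in ranked[:number]]

rows = components(families)
for method in METHODS:
    print(method, select(rows, method, 4))
```

With these figures the three methods choose different sets of four individuals: individual selection takes the four highest measurements, family selection takes the whole of the family with the highest mean, and within-family selection takes those that most exceed their own family means. Combined selection would use intermediate weights, which are not worked out here.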

The  salient  features  of  the  three  simpler  methods  are  as  follows, 
the differences of procedure between them being illustrated diagrammatically
in Fig. 13.1.

Individual  selection.  Individuals  are  selected  solely  in  accord- 
ance with  their  own  phenotypic  values.  This  method  is  usually  the 
simplest  to  operate  and  in  many  circumstances  it  yields  the  most  rapid 
response.  It  should  therefore  be  used  unless  there  are  good  reasons 
for  preferring  another  method.  Mass  selection  is  a  term  often  used  for 
individual  selection,  especially  when  the  selected  individuals  are  put 
together  en  masse  for  mating,  as  for  example  Drosophila  in  a  bottle. 
The  term  individual  selection  is  used  more  specifically  when  the 
matings  are  controlled  or  recorded,  as  with  mice  or  larger  animals. 

Family selection. Whole families are selected or rejected as
units according to the mean phenotypic value of the family. Individual
values are thus not acted on except in so far as they determine
the family mean. In other words the within-family deviations are
given zero weight. The families may be of full sibs or half sibs, families
of more remote relationship being of little practical significance. The
use of full-sib families is dependent on a high reproductive rate and
with slow-breeding organisms half sibs must generally be used.

Fig. 13.1. Diagram to illustrate the different methods of selection.
The dots and circles represent individuals plotted on a
vertical scale of merit, those with the best measurements being at
the top. The individuals to be selected are those shown as dots.
There are 5 families each with 5 individuals; (a), (b), and (c) show
identical arrangements of the same 25 individuals. The families
are separated laterally, with the individuals of each family placed
one above the other. The mean of each family is shown by a cross-bar.
The situation in which within-family selection is most useful
is shown in (d), where the variation between families is very great
in comparison with the variation within families. (Redrawn from
Falconer, 1957a.)

The chief circumstance under which family selection is to be preferred
is when the character selected has a low heritability. The
efficacy of family selection rests on the fact that the environmental
deviations of the individuals tend to cancel each other out in the mean
value of the family. So the phenotypic mean of the family comes close
to being a measure of its genotypic mean, and the advantage gained is
greater when environmental deviations constitute a large part of the
phenotypic variance, or in other words when the heritability is low.
On the other hand, environmental variation common to members of a
family impairs the efficacy of family selection. If this component is
large, as illustrated in Fig. 13.1 (d), it will tend to swamp the genetic
differences between families and family selection will be correspondingly
ineffective. Another important factor in the efficacy of
family selection is the number of individuals in the families, or the
family size. The larger the family the closer is the correspondence
between mean phenotypic value and mean genotypic value. So the
conditions that favour family selection are low heritability, little
variation due to common environment, and large families.

There are practical difficulties in the application of family selection,
particularly in laboratory populations. They arise from the
conflict between the intensity of selection and the avoidance of inbreeding.
It is generally desirable to keep the rate of inbreeding as
low as possible. If the minimum number of parents is fixed by considerations
of inbreeding, say at ten pairs, then under family
selection ten families must be selected, since each family represents
only one pair of parents in the previous generation. And, if a reasonably
high intensity of selection is to be achieved, the number of
families bred and measured must be perhaps twice to four times this
number. Family selection is thus costly of space, and if breeding space
is limited the intensity of selection that can be achieved under family
selection may be quite small. The two following methods are variants
of family selection.
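
The trade-off between breeding space and intensity of selection can be put in rough numbers. The sketch below assumes normally distributed family means and truncation selection among a large number of candidates, so it ignores the small-sample correction that would matter with only a few dozen families. The function name `selection_intensity` is illustrative.

```python
from statistics import NormalDist

def selection_intensity(proportion_selected):
    """Approximate intensity of selection i for truncation selection.

    Assumes normally distributed values and a large number of candidates,
    so small-sample corrections are ignored.
    """
    if not 0.0 < proportion_selected < 1.0:
        raise ValueError("proportion_selected must lie strictly between 0 and 1")
    std_normal = NormalDist()
    truncation_point = std_normal.inv_cdf(1.0 - proportion_selected)
    # Mean of the selected group, in standard deviations above the mean.
    return std_normal.pdf(truncation_point) / proportion_selected

families_needed = 10  # ten pairs of parents, as in the example above
for families_measured in (20, 30, 40):
    p = families_needed / families_measured
    print(families_measured, round(selection_intensity(p), 2))
```

Under these assumptions, measuring four times as many families as are needed gives a clearly higher intensity than measuring twice as many, at the cost of twice the breeding space.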

Sib  selection.  Some  characters,  we  have  already  noted,  cannot 
be  measured  on  the  individuals  that  are  to  be  used  as  parents,  and 
selection  can  only  be  based  on  the  values  of  relatives.  This  amounts 
to family selection but with the difference that now the selected individuals
have not contributed to the estimate of their family mean.
The  difference  affects  the  way  in  which  the  response  is  influenced  by 
family  size.  Where  the  distinction  is  of  consequence  we  shall  use  the 
term  sib  selection  when  the  selected  individuals  are  not  measured  and 
family  selection  when  they  are  measured  and  included  in  the  family 
mean.  When  families  are  very  large  the  two  methods  are  equivalent, 
and  the  term  family  selection  is  then  to  be  understood  to  cover  both. 

Progeny testing is a method of selection widely applied in animal
breeding. We shall not discuss it in detail, except in so far as it
can be treated as a form of family selection. The criterion of selection,
as the name implies, is the mean value of an individual's progeny.
At first sight this might seem to be the ideal method of selection and
the easiest to evaluate because, as we saw in Chapter 7, the mean
value of an individual's offspring comes as near as we can get to a
direct measure of its breeding value, and is in fact the operational
definition of breeding value. In practice, however, it suffers from the
serious drawback of a much lengthened generation interval, because
the selection of the parents cannot be carried out until the offspring
have been measured. The evaluation of selection by progeny testing
is apt to be rather confusing because of the inevitable overlapping of
generations, and because of a possible ambiguity about which generation
is being selected, the parents or the progeny. The progeny,
whose mean is used to judge the parents, are ready to be used as
parents just when the parents have been tested and await selection.
Thus both the selected parents and their progeny are used concurrently
as parents. The difficulty of interpretation may be partially
overcome by regarding progeny testing as a modified form of family
selection. The progenies are families, usually of half sibs, and selection
is made between them on the basis of the family means in the
manner described above. The only difference is that the selected
families are increased in size by allowing their parents to go on breeding.
The additional, younger, members of the families do not contribute
to the estimates of the family means and are therefore selected
by sib selection. Increasing the size of the selected families by unmeasured
individuals does not improve the accuracy of the selection,
but it reduces the replacement rate and so increases the intensity of
selection that can be achieved. This is the principal advantage of
progeny testing, but it can only be realised in operations on a large
scale, when the danger of inbreeding is not introduced by limitation
of space.

Within-family  selection.  The  criterion  of  selection  is  the 
deviation  of  each  individual  from  the  mean  value  of  the  family  to 
which  it  belongs,  those  that  exceed  their  family  mean  by  the  greatest 
amount  being  regarded  as  the  most  desirable.  This  is  the  reverse  of 
family  selection,  the  family  means  being  given  zero  weight.  The  chief 
condition  under  which  this  method  has  an  advantage  over  the  others 
is  a  large  component  of  environmental  variance  common  to  members 
of a family. Fig. 13.1 (d) shows how within-family selection would be
applied in this situation. Pre-weaning growth of pigs or mice might
be  cited  as  examples  of  such  a  character.  A  large  part  of  the  variation 
of  individuals'  weaning  weights  is  attributable  to  the  mother  and  is 
therefore  common  to  members  of  a  family.  Selection  within  families 
would  eliminate  this  large  non-genetic  component  from  the  variation 
operated on by selection. An important practical advantage of selection
within families, especially in laboratory experiments, is that it
economises  breeding  space,  for  the  same  reason  that  family  selection 
is  costly  of  space.  If  single-pair  matings  are  to  be  made,  then  two 
members  of  every  family  must  be  selected  in  order  to  replace  the 
parents.  This  means  that  every  family  contributes  equally  to  the 
parents  of  the  next  generation,  a  system  that  we  saw  in  Chapter  4 
renders  the  effective  population  size  twice  the  actual.  Thus  when 
selection  within  families  is  practised,  the  breeding  space  required  to 
keep  the  rate  of  inbreeding  below  a  certain  value  is  only  half  as  great 
as  would  be  required  under  individual  selection. 


Expected  Response 

To evaluate the relative merits of the different methods of selection
we have to deduce the response expected from each. There is
nothing to be added here about individual selection to what was said
in Chapter 11. The expected response was given in equation 11.3 as
$R = i\sigma_P h^2$, where $i$ is the intensity of selection (i.e. the selection
differential in standard deviations), $\sigma_P$ is the standard deviation, and
$h^2$ the heritability, of the phenotypic values of individuals. The
response expected under family selection or within-family selection
is arrived at in an analogous manner. Under family selection, the
criterion of selection is the mean phenotypic value of the members of
a family, so the expected response to family selection is

$$R_f = i\sigma_f h_f^2 \tag{13.2}$$

where $i$ is the intensity of selection, $\sigma_f$ is the observed standard
deviation of family means, and $h_f^2$ is the heritability of family means.
In the same way the expected response to within-family selection is

$$R_w = i\sigma_w h_w^2 \tag{13.3}$$

where $\sigma_w$ is the standard deviation, and $h_w^2$ the heritability, of within-family
deviations.

The concept of heritability applied to family means or to within-family
deviations introduces no new principle. It is simply the proportion
of the phenotypic variance of these quantities that is made
up of additive genetic variance. These heritabilities can be expressed
in terms of the heritability of individual values (which we shall continue
to refer to simply as the heritability, with symbol $h^2$), the phenotypic
correlation between members of families, and the number of
individuals in the families, all of which can be estimated by observation.
To arrive at the appropriate expressions we have to consider
again how the observational components of variance are made up of
the causal components, as explained in Chapters 9 and 10 (see in
particular Tables 9.4 and 10.4). First let us simplify matters by
supposing that all families contain a large number of individuals, so
that the means of all families are estimated without error. Consider
first the phenotypic variance. The intra-class correlation, $t$, between
members of families is the between-group component divided by the
total variance: $t = \sigma_B^2/\sigma_T^2$. Therefore the between-group component
can be expressed as $\sigma_B^2 = t\sigma_T^2$, and the within-group component as
$\sigma_W^2 = (1-t)\sigma_T^2$. This expresses the partitioning of the phenotypic
variance into its observational components. The total variance,
written here as $\sigma_T^2$, is the phenotypic variance which we shall write
as $V_P$ in the context of causal components. Now, the partitioning of
the additive variance between and within families can be expressed
in the same way, in terms of the correlation of breeding values, for
which we shall use the symbol $r$. (The meaning of this correlation
will be explained in a moment.) Thus the additive variance between
families is $rV_A$ and the additive variance within families is $(1-r)V_A$.
The dual partitioning is summarised in Table 13.1.

Table 13.1

Partitioning of the variance between and within families of
large size.

Observational component           Additive variance    Phenotypic variance

Between families, $\sigma_B^2$    $rV_A$               $tV_P$

Within families, $\sigma_W^2$     $(1-r)V_A$           $(1-t)V_P$

This partitioning of both the additive and the phenotypic variance
leads at once to the heritabilities of family means and of within-family
deviations, since these heritabilities are simply the ratios of
the additive variance to the phenotypic variance. Thus, when the
families are large, the heritability of family means is $rV_A/tV_P$, or $(r/t)h^2$,
since $V_A/V_P$ is the heritability of individual values, $h^2$.

The correlation of breeding values between members of families
is a measure of the degree of relationship, usually called the "coefficient
of relationship." The correlation between the breeding values
of relatives in a random-mating population is twice their coancestry,

$$r = 2f \tag{13.4}$$

that is to say, twice the inbreeding coefficient of their progeny if the
relatives were mated together. Its values in full-sib and half-sib
families can be seen from Table 9.4; for full sibs it is 1/2 and for half
sibs it is 1/4. In order to be able to discuss full-sib and half-sib families
at the same time in what follows, we shall retain the symbol $r$ in the
formulae instead of inserting the appropriate values of 1/2 or 1/4.

The foregoing account of the heritabilities of family means and
within-family deviations was simplified by the supposition of large
families. This simplification is not justified in practice and we must
now remove it by considering families of finite size. We shall, however,
suppose that all families are of equal size. The number of
individuals in a family, called the family size, has to be taken into
consideration for the following reason. If selection is based on the
family mean, or on the deviations from the family mean, then it is the
observed mean that we are concerned with and not the true mean. In
other words we are not concerned with the observational components
of variance which we have hitherto discussed, but with the variance of
the observed means and of the observed within-family deviations.
The observed means of groups are subject to sampling variance which
comes from the within-group variance. If there are $n$ individuals in a
group then the sampling variance of the group-mean is $(1/n)\sigma_W^2$, where
$\sigma_W^2$ is the component of variance within the group. Thus the variance of
observed group-means is augmented by $(1/n)\sigma_W^2$, and the variance of
observed deviations within groups is correspondingly diminished by
the same amount. The observed variances, with family size $n$, are
therefore made up of the observational components as shown in
Table 13.2. The causal components entering into the observed
variances can now be found by translating the observational components
into causal components from Table 13.1. They are shown in
the two right-hand columns of Table 13.2.

Table 13.2
Composition of observed variances with families of size n.

Observed variance              Observational components                 Causal components
                                                                        Additive                      Phenotypic

of family means                $\sigma_B^2 + \frac{1}{n}\sigma_W^2$     $\frac{1+(n-1)r}{n}V_A$       $\frac{1+(n-1)t}{n}V_P$

of within-family deviations    $\frac{n-1}{n}\sigma_W^2$                $\frac{(n-1)(1-r)}{n}V_A$     $\frac{(n-1)(1-t)}{n}V_P$

To find the heritabilities of family means and of within-family
deviations we have only to divide the additive component by the
phenotypic component of the observed variances. Thus the heritability
of family means is

$$h_f^2 = \frac{1+(n-1)r}{1+(n-1)t}\,h^2$$

and the heritability of within-family deviations is

$$h_w^2 = \frac{1-r}{1-t}\,h^2$$

At this point sib selection has to be distinguished from family
selection. The foregoing account referred to family selection where
the individuals to be selected were themselves measured and contributed
to the observed family mean. Sib selection differs in that the individuals
selected are not measured. This does not affect the phenotypic component,
because this is simply the observed variance of what is
measured. But it does affect the additive component, because the
mean breeding value with which we are concerned is not that of the
individuals whose phenotypic values have been measured, but of
others that have not been measured. Therefore the appropriate
variance of mean breeding values is simply the between-family component
of additive variance, $rV_A$, irrespective of the number of other
individuals that have been measured. The heritability of family
means appropriate to sib selection is therefore

$$h_s^2 = \frac{nr}{1+(n-1)t}\,h^2$$

The heritabilities of the different methods of selection, whose derivations
have now been explained, are listed in Table 13.3.
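
As a numerical check on these heritabilities, the sketch below codes the three expressions directly. The function names and the example values (h2 = 0.2, full sibs, t = 0.1, n = 8) are illustrative assumptions, not figures from the text.

```python
def h2_family_means(h2, n, r, t):
    """Heritability of observed family means (family selection)."""
    return h2 * (1 + (n - 1) * r) / (1 + (n - 1) * t)

def h2_sib(h2, n, r, t):
    """Heritability of family means when the selected individuals
    are not themselves measured (sib selection)."""
    return h2 * n * r / (1 + (n - 1) * t)

def h2_within(h2, r, t):
    """Heritability of within-family deviations."""
    return h2 * (1 - r) / (1 - t)

# Illustrative values: full sibs (r = 1/2), no common environment (t = h2 / 2).
h2, r, t, n = 0.2, 0.5, 0.1, 8
print(round(h2_family_means(h2, n, r, t), 3))
print(round(h2_sib(h2, n, r, t), 3))
print(round(h2_within(h2, r, t), 3))
```

Two limiting cases are worth checking by hand: with n = 1 the family-mean heritability reduces to the individual heritability, and with very large n both the family and the sib forms approach (r/t) times the individual heritability.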

To deduce the expected response is now a simple matter. Let us
take family selection for illustration. The expected response was
given in equation 13.2 as

$$R_f = i\sigma_f h_f^2$$

where $\sigma_f$ is the standard deviation of observed family means. This
expression, however, is not much use as it stands, because it does not
readily allow a comparison to be made with the other methods. It
will be most convenient to cast it into a form that facilitates comparison
with individual selection. This can be done by substituting the
expression for the heritability of family means, $h_f^2$, given above, and
by putting the standard deviation of observed family means, $\sigma_f$, in
terms of the standard deviation of individual phenotypic values,
$\sigma_P$ ($= \sqrt{V_P}$), from the right-hand column of Table 13.2. The expected
response then becomes

$$R_f = i\,\sigma_P\sqrt{\frac{1+(n-1)t}{n}}\;h^2\,\frac{1+(n-1)r}{1+(n-1)t}$$

which reduces to

$$R_f = i\sigma_P h^2\left[\frac{1+(n-1)r}{\sqrt{n\{1+(n-1)t\}}}\right]$$

The term $i\sigma_P h^2$ is equivalent to the expected response under individual
selection, so the expression within the square brackets is the
factor that compares family selection with individual selection. The
expression looks very complicated but it contains only three simple
quantities: $n$, which is the family size; $r$, which is 1/2 for full-sib and
1/4 for half-sib families; and $t$, which is the phenotypic intra-class
correlation.

Table 13.3

Heritability and expected response under different methods
of selection.

Method of selection   Heritability                                Expected response

Individual            $h^2$                                       $R = i\sigma_P h^2$

Family                $h_f^2 = h^2\,\frac{1+(n-1)r}{1+(n-1)t}$    $R_f = i\sigma_P h^2\,\frac{1+(n-1)r}{\sqrt{n\{1+(n-1)t\}}}$

Sib                   $h_s^2 = h^2\,\frac{nr}{1+(n-1)t}$          $R_s = i\sigma_P h^2\,\frac{nr}{\sqrt{n\{1+(n-1)t\}}}$

Within-family         $h_w^2 = h^2\,\frac{1-r}{1-t}$              $R_w = i\sigma_P h^2\,(1-r)\sqrt{\frac{n-1}{n(1-t)}}$

Combined                                                          $R_c = i\sigma_P h^2\sqrt{1+\frac{(r-t)^2}{1-t}\cdot\frac{n-1}{1+(n-1)t}}$

$i$ = intensity of selection (selection differential in standard measure):
assumed to be equal for all methods, but not necessarily so.
$\sigma_P$ = standard deviation in phenotypic values of individuals.
$h^2$ = heritability of individual values.
$r$: with full-sib families, $r$ = 1/2;
with half-sib families, $r$ = 1/4.
$t$ = correlation of phenotypic values of members of the families.
$n$ = number of individuals in the families.
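
The comparison factors are easy to evaluate numerically. The following sketch codes the factor for family selection derived above, together with the corresponding factors for sib and within-family selection obtained in the same way from the observed standard deviation and the appropriate heritability. The function names and the example values are illustrative assumptions.

```python
from math import sqrt

def family_vs_individual(n, r, t):
    """Response to family selection as a multiple of individual selection."""
    return (1 + (n - 1) * r) / sqrt(n * (1 + (n - 1) * t))

def sib_vs_individual(n, r, t):
    """Response to sib selection as a multiple of individual selection."""
    return n * r / sqrt(n * (1 + (n - 1) * t))

def within_vs_individual(n, r, t):
    """Response to within-family selection as a multiple of individual selection."""
    return (1 - r) * sqrt((n - 1) / (n * (1 - t)))

# Illustrative case: full-sib families of 8 with a low phenotypic correlation.
n, r, t = 8, 0.5, 0.1
for name, factor in (("family", family_vs_individual),
                     ("sib", sib_vs_individual),
                     ("within-family", within_vs_individual)):
    print(name, round(factor(n, r, t), 3))
```

A factor above 1 means the method is expected to outperform individual selection at equal intensity of selection. A family of size 1 gives a family factor of exactly 1, as it should.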

The  expected  responses  under  the  different  methods  of  selection 
are  listed  in  Table  13.3,  all  expressed  in  this  manner  which  allows 
the  comparisons  to  be  made  with  individual  selection.  The  relative 
merits  of  the  different  methods  will  be  discussed  in  the  next  section: 
first  we  must  deal  with  combined  selection. 

Combined  selection.  We  shall  deal  very  briefly  with  combined 
selection,  referring  the  reader  to  Lush  (1947),  Lerner  (1950)  and 
A.  Robertson  (1955a)  for  details.  First  we  have  to  find  what  are  the 
appropriate  weighting  factors  to  be  used  in  its  application.  We  saw 
before  that  the  phenotypic  value  of  an  individual  is  made  up  of  two 
parts,  the  family  mean  and  the  within-family  deviation,  P=Pf+Pw, 
and  that  each  part  gives  some  information  about  the  individual's 
breeding value. In Chapter 10 we saw that the heritability is equivalent
to the regression of breeding value on phenotypic value
(equation 10.2), so that the best estimate of an individual's breeding
value to be derived from its phenotypic value is $h^2P$. This idea can
be  applied  separately  to  the  two  parts  of  the  phenotypic  value,  since 
these  are  uncorrelated  and  supply  independent  information  about  the 
breeding  value.  Therefore,  taking  both  parts  of  the  phenotypic  value 
into  account,  the  best  estimate  of  an  individual's  breeding  value  is 
given  by  the  multiple  regression  equation 

expected breeding value $= h_f^2P_f + h_w^2P_w$

($P_f$ being measured as a deviation from the population mean, and $P_w$
as a deviation from the family mean). The weighting factors that
make the most efficient use of the two sources of information are
therefore the two heritabilities, appropriate to family means and to
within-family deviations respectively. The criterion of selection
under combined selection is thus an index, $I$, in the form

$$I = h_f^2P_f + h_w^2P_w \tag{13.5}$$


If  the  values  of  the  heritabilities  are  inserted  from  Table  13.3  it  will 
be  seen  that  the  term  h2  is  common  to  both  weighting  factors,  and 
this  term  may  therefore  be  omitted  without  affecting  the  relative 
weighting.  We  then  have  an  index  for  the  computation  of  which  only 
n,  r,  and  t  need  be  known.  In  practice  it  is  more  convenient  to  work 
with  the  individual  values  in  place  of  the  within-family  deviations, 
and to assign them a weight of 1. The family mean is thus used in the
manner of a correction, supplementing the information provided by
the individual itself. Rearrangement of the appropriate weighting
factor for the family mean leads to an index made up as follows (Lush,
1947):

$$I = P + \left[\frac{r-t}{1-r}\cdot\frac{n}{1+(n-1)t}\right]P_f \tag{13.6}$$

where  P  is  the  individual  value  and  Pf  the  family  mean,  in  which  the 
individual  itself  is  included. 
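
The rearrangement can be sketched as follows, assuming the weights are the two heritabilities of family means and of within-family deviations with the common factor $h^2$ dropped, and writing the within-family deviation as $P_w = P - P_f$. Dividing through by the weight of the within-family deviation gives the individual value a weight of 1.

```latex
\begin{align*}
I &\propto \frac{1+(n-1)r}{1+(n-1)t}\,P_f + \frac{1-r}{1-t}\,(P - P_f) \\
  &\propto P + \left[\frac{\{1+(n-1)r\}(1-t)}{\{1+(n-1)t\}(1-r)} - 1\right]P_f \\
  &= P + \left[\frac{r-t}{1-r}\cdot\frac{n}{1+(n-1)t}\right]P_f ,
\end{align*}
% the last step uses
% \{1+(n-1)r\}(1-t) - \{1+(n-1)t\}(1-r) = n(r-t).
```

The weight on the family mean is positive when $r$ exceeds $t$ and negative when $t$ exceeds $r$, so with a high phenotypic correlation the family mean is subtracted rather than added.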

This  solution  of  the  problem  of  how  we  can  best  make  use  of  the 
information  provided  by  relatives  is  now  cast  in  precisely  the  form 
in  which  the  problem  was  introduced  at  the  beginning  of  this 
chapter. The expression in the square brackets in equation 13.6,
which  contains  nothing  but  easily  measurable  quantities,  shows  how 
we  can  best  use  the  family  mean  to  supplement  the  individual  values 
in  making  the  selection. 

The  expected  response  to  combined  selection,  cast  in  a  form 
suitable  for  comparison  with  individual  selection,  is  given  at  the  foot 
of  Table  13.3.  For  its  derivation  see  Lush  (1947). 
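
A minimal sketch of the index weight and of the response to combined selection follows. It does not reproduce Lush's derivation. Instead it assumes, as stated above, that the family mean and the within-family deviation supply independent information, so that the squared relative responses of family and within-family selection add. The function names are illustrative.

```python
from math import sqrt

def family_mean_weight(n, r, t):
    """Weight given to the family mean P_f in the index I = P + weight * P_f."""
    return (r - t) / (1 - r) * n / (1 + (n - 1) * t)

def combined_vs_individual(n, r, t):
    """Response to combined selection as a multiple of individual selection.

    Assumption: the two parts of the index are uncorrelated, so their
    squared relative responses are additive.
    """
    family = (1 + (n - 1) * r) / sqrt(n * (1 + (n - 1) * t))
    within = (1 - r) * sqrt((n - 1) / (n * (1 - t)))
    return sqrt(family ** 2 + within ** 2)

# Illustrative case: full-sib families of 8, phenotypic correlation 0.1.
print(round(family_mean_weight(8, 0.5, 0.1), 3))
print(round(combined_vs_individual(8, 0.5, 0.1), 3))
```

When r equals t the weight on the family mean is zero and the index reduces to individual selection, which is then as good as combined selection.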


Relative  Merits  of  the  Methods 

The formulae for the expected responses that we have derived
enable us to compare one method of selection with another and discover
what are the conditions that determine the choice of the best
method. Before making detailed comparisons let us note the reason
for individual selection being usually better than either family selection
or within-family selection. The reason is that the standard
deviations of family means and of within-family deviations are both
bound to be less than the standard deviation of individual values;
and the standard deviation of the criterion of selection is one of the
factors governing the response. If we compare, for example, family
selection with individual selection by writing the expected responses
in the form

$R = i\sigma_P h^2$ (for individual selection)
and $R_f = i\sigma_f h_f^2$ (for family selection)

then it is clear that family selection cannot be better than individual
selection unless the heritability of family means, $h_f^2$, is greater than
the heritability of individual values, $h^2$, by an amount great enough
to counterbalance the lower standard deviation of family means.
And the same applies to within-family selection.

A general picture of the circumstances that make one method
better than another can best be obtained from graphical representations
of the relative responses: that is, the response expected from
one method expressed as a proportion of the response expected from
another, the expected responses being taken from Table 13.3. In
making these comparisons we shall assume that the intensity of
selection is the same for all methods. Though not necessarily true,
this simplification is unavoidable because no generalisation can be
made about the proportions selected under the different methods.
We shall make the comparisons separately for full-sib families ($r$ = 1/2)
and for half-sib families ($r$ = 1/4). Then the relative responses depend
only on two factors, the family size, $n$, and the intra-class correlation
of phenotypic values, $t$. If there is no variance due to common environment
contributing to the variance of family means, then the
correlation in full-sib families is equal to half the heritability, and that
in half-sib families to one quarter of the heritability. This lets us see
in a general way how the heritability of the character influences the
relative response. It is, however, the correlation and not the heritability
that is the determining factor, so only the correlation need be
known when a choice of method is to be made.
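
The link between heritability and the relative response can be illustrated with a short sketch. It assumes that the phenotypic correlation is r times the heritability plus any common-environment contribution; the parameter `c2` (the fraction of phenotypic variance due to common environment) is a hypothetical name introduced here, and the family size of 8 is arbitrary.

```python
from math import sqrt

def expected_sib_correlation(h2, r, c2=0.0):
    """Phenotypic intra-class correlation, t = r * h2 + c2.

    c2 is a hypothetical parameter (not a symbol from the text): the
    fraction of phenotypic variance due to common environment.
    """
    return r * h2 + c2

def family_vs_individual(n, r, t):
    """Response to family selection as a multiple of individual selection."""
    return (1 + (n - 1) * r) / sqrt(n * (1 + (n - 1) * t))

n = 8  # illustrative family size
for h2 in (0.1, 0.3, 0.6):
    for label, r in (("full sibs", 0.5), ("half sibs", 0.25)):
        t = expected_sib_correlation(h2, r)
        print(label, h2, round(t, 3), round(family_vs_individual(n, r, t), 3))
```

Passing a non-zero `c2` raises the correlation without raising the heritability, which shows why the correlation, not the heritability, is the quantity that decides between methods.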

Fig. 13.2 gives a general picture of all the methods, showing how
their relative merits depend on the phenotypic correlation. The
graphs refer only to full-sib families and only to the two extremes of
family size: infinitely large families in (a) and families of 2 in (b).
The comparisons are made here with combined selection since this is
necessarily the method that gives the greatest response. The graphs
therefore show the ratio of the response for each method to that for
combined selection: e.g. for family selection, the ratio $R_f/R_c$. The
general picture indicated by the graphs is as follows. The relative
merit of individual selection is greatest when the correlation is 0.5
and falls off as the correlation drops below or rises above this value.
The relative merit of family selection is greatest when the correlation
is low, and that of within-family selection when the correlation is
high. Now, a low correlation between sibs can only result from a
character of low heritability, and with very little variance due to
common environment. These therefore are the circumstances that
favour family selection. A high correlation can only result from a
large amount of variance due to common environment. Even if the
heritability were 100 per cent the correlation between full sibs could
not exceed 0.5 without augmentation by common environment. A
large amount of variation due to common environment is therefore
the circumstance that favours within-family selection. We shall
examine the three simpler methods in more detail in a moment.
First let us look at what may be gained from combined selection.
Though combined selection is always as good as or superior to any
other method, its superiority is never very great. With large families
its superiority is greatest when the correlation is close to 0.25 or 0.75,
but even then its superiority is not much more than 10 per cent.
With families of 2 its superiority reaches 20 per cent when the correlation
is 0.875. Thus the range of circumstances under which
combined selection is more than a few per cent better than one or
other of the simpler methods is very narrow. In general, therefore,
there is little to be gained from the extra trouble of applying combined
selection, and we shall not give it any further consideration.

Fig. 13.2. Relative merits of the different methods of selection,
with full-sib families. Responses relative to that for combined
selection plotted against the phenotypic intra-class correlation, t.
I = individual selection; F = family selection; W = within-family
selection.
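
The general shape of these comparisons can be recomputed with a short sketch. It assumes full-sib families (r = 1/2), equal intensity of selection for all methods, and a combined response obtained by adding the squared relative responses of family and within-family selection, on the grounds that the two parts of the index are uncorrelated. A family size of one million stands in for infinitely large families; the function name is illustrative.

```python
from math import sqrt

def relative_to_combined(n, t, r=0.5):
    """Responses of the three simpler methods as fractions of the response
    to combined selection, each first expressed relative to individual
    selection."""
    family = (1 + (n - 1) * r) / sqrt(n * (1 + (n - 1) * t))
    within = (1 - r) * sqrt((n - 1) / (n * (1 - t)))
    # Assumption: independent sources of information, so squares add.
    combined = sqrt(family ** 2 + within ** 2)
    return {"I": 1 / combined, "F": family / combined, "W": within / combined}

for n in (2, 1_000_000):
    for t in (0.05, 0.25, 0.5, 0.75, 0.875):
        ratios = relative_to_combined(n, t)
        print(n, t, {k: round(v, 3) for k, v in ratios.items()})
```

In each row the largest of the three ratios shows how close the best simple method comes to combined selection at that correlation and family size.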

Let  us  now  examine  the  simpler  methods  in  more  detail.  The 
most  useful  comparison  to  make  now  is  with  individual  selection. 
The  expected  responses  will  therefore  be  expressed  as  a  proportion 
of  the  response  to  individual  selection.  We  shall  examine  each  method 
in  turn,  commenting  on  the  special  questions  that  arise  in  connexion 
with  each. 

Family selection. Fig. 13.3 shows the relative response, $R_f/R$,
plotted  against  the  family  size,  n,  for  full-sib  families  in  (a)  and 
for  half-sib  families  in  (b).  These  graphs  therefore  show  primarily  the 
effect  of  family  size  on  the  relative  merit  of  family  selection,  but  the 
magnitude  of  the  correlation,  t,  is  taken  into  account  by  separate 
curves  for  different  correlations.  Only  the  circumstances  when  family 
selection  is  superior  to  individual  selection  are  shown  on  the  graphs. 
The chief points made clear by the graphs are these. (i) As we saw
from Fig. 13.2, there is a critical value of the correlation, above which
family selection cannot be superior to individual selection. From the
expected responses in Table 13.3 it is easy to show that when the
families are large the relative response expected is $R_f/R = r/\sqrt{t}$.
So, with large families, family selection becomes superior to individual
selection when r exceeds $\sqrt{t}$. The critical value of the correlation, t,
depends a little on the family size and differs between full-sib and
half-sib families. Family selection with full sibs is very little better
than individual selection unless the correlation is below 0.2; and with
half sibs unless it is below 0.05. (ii) The effect of family size is greatest
when  the  correlation  is  low.  Therefore  there  is  little  to  be  gained 
from  very  large  families  unless  the  correlation  is  well  below  the  critical 
value.  There  is,  however,  another  consideration  in  connexion  with 
the family size which will be explained later. (iii) Finally, there is the
question  whether  full  sibs  or  half  sibs  are  to  be  preferred  for  family 
selection.  This  depends  so  much  on  the  special  circumstances  that 
general  conclusions  cannot  be  drawn.  From  the  graphs  it  would 
appear  that  full  sibs  must  always  be  better  than  half  sibs.  But  the 
full-sib correlation is more likely to be increased by common
environment, and full-sib families are likely to be a good deal smaller
than half-sib families. Both these factors work in favour of half-sib
families. It has been shown that in selection for egg-production in
poultry the factor of family size makes half-sib families superior to
full sibs (Osborne, 1957a).

Fig. 13.3. Responses expected under family selection relative to
that for individual selection, plotted against family size. The
separate curves refer to different values of the phenotypic
correlation, t, as indicated. The corresponding values of the
heritability, h², in the absence of variation due to common environment,
are also given. (a) full-sib families; (b) half-sib families.
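
The dependence of the relative response on family size can be sketched numerically. The Python fragment below is only an illustration: it assumes the finite-family expression R_f/R = [1 + (n - 1)r] / sqrt(n[1 + (n - 1)t]), which is taken here as the form given in Table 13.3 (not reproduced in this section) and which reduces to r/sqrt(t) for large families. The function names are hypothetical.

```python
import math


def family_relative_response(n: int, r: float, t: float) -> float:
    """Response to family selection relative to individual selection.

    Assumed form: [1 + (n - 1) r] / sqrt(n [1 + (n - 1) t]),
    with n the family size, r the correlation of breeding values
    (0.5 for full sibs, 0.25 for half sibs) and t the phenotypic
    correlation between members of a family.
    """
    return (1 + (n - 1) * r) / math.sqrt(n * (1 + (n - 1) * t))


def large_family_limit(r: float, t: float) -> float:
    """Limiting value r / sqrt(t) stated in the text for large families."""
    return r / math.sqrt(t)


if __name__ == "__main__":
    # Full-sib families (r = 0.5) at several correlations and family sizes.
    for t in (0.05, 0.1, 0.2, 0.3):
        row = [round(family_relative_response(n, 0.5, t), 3)
               for n in (2, 5, 10, 30)]
        print(t, row, round(large_family_limit(0.5, t), 3))
```

With r = 0.5 the limiting value equals 1 at t = 0.25, which is the critical correlation for large full-sib families under these assumptions.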

Sib selection. The use of this method is usually dictated by
necessity rather than by choice, and comparisons with other methods
are of less interest. The chief practical question that arises concerns
the family size: how many sibs should be measured? Or, how far is it
worth while increasing family size? The effect of family size on the
response to sib selection is shown in Fig. 13.4. The graphs show the
response with family size n, as a percentage of the response with
infinitely large families, which would be the maximum possible
response. The graphs are valid for both full and half sibs. Again the
effect of increasing family size is greatest when the correlation is low.
But with sib selection as with family selection there is another
consideration to be taken into account in connexion with the family size,
which will now be explained.

Fig. 13.4. Effect of family size on the response to sib selection,
with either full- or half-sib families. The expected response is
shown as a percentage of the response with infinitely large families.
The separate curves refer to different values of the phenotypic
correlation, t, as indicated.



Optimal  family  size.  Though  the  graphs  suggest  that  the  larger 
the  family  size  the  greater  will  be  the  response,  under  both  family 
selection and sib selection, this is not so in practice because the
intensity of selection is involved as a factor in the following way. In
practice  there  is  always  a  limitation  on  the  amount  of  breeding  space 
or  facilities  for  measurement.  The  total  available  space  can  be  filled 
with  a  large  number  of  small  families,  or  with  a  small  number  of  large 
families.  Considerations  of  inbreeding  set  a  lower  limit  to  the  number 
of  families  that  will  be  selected,  so  the  larger  the  number  of  families 
measured  the  greater  will  be  the  intensity  of  selection.  Therefore 
there  is  a  conflict  of  advantage  between  the  size  of  the  families  and 
the  intensity  of  selection:  large  families  lead  to  a  lower  intensity  of 
selection.  When  the  intensity  of  selection  is  taken  into  consideration 
it  turns  out  that  there  is  an  optimal  family  size  which  gives  the 
greatest  expected  response.  The  optimal  family  size  with  half-sib 
families  can  be  found  approximately  from  the  following  simple 
formula (A. Robertson, 1957b):


$$n = 0.56\sqrt{\frac{T}{N h^2}} \tag{13.7}$$

where n is the optimal family size, T is the total number of individuals
that  can  be  accommodated  and  measured,  N  is  the  number  of  families 
to  be  selected,  and  h2  is  the  heritability  of  the  character. 
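
As a rough illustration of how the formula might be used, the Python sketch below reads it as n = 0.56 * sqrt(T / (N * h2)), with T, N and h2 as defined above. The function name and the numerical inputs are hypothetical and chosen only to show the arithmetic.

```python
import math


def optimal_half_sib_family_size(total: float, n_selected: float, h2: float) -> float:
    """Approximate optimal family size for half-sib families.

    total      -- T, number of individuals that can be accommodated and measured
    n_selected -- N, number of families to be selected
    h2         -- heritability of the character
    Assumes the reading n = 0.56 * sqrt(T / (N * h2)).
    """
    return 0.56 * math.sqrt(total / (n_selected * h2))


if __name__ == "__main__":
    # Hypothetical facilities: 1000 individuals, 10 families selected, h2 = 0.25.
    n_opt = optimal_half_sib_family_size(1000, 10, 0.25)
    print(round(n_opt, 1), "per family;", round(1000 / n_opt), "families measured")
```

The optimum falls as the heritability rises, because fewer sibs are then needed to estimate a family's breeding value and more families can be measured instead.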

Within-family selection. Fig. 13.5 shows the relative response,
$R_w/R$, for within-family selection applied to full-sib families.
Half-sib families need not be considered since the method is unlikely to be
applied to them. The graphs show primarily the effect of the
phenotypic correlation, t, on the response. Four graphs are given
representing family sizes between 2 and 30, and it can be seen that the
family size does not have a great effect. The relative response when
the families are very large can be shown from the expected responses
given in Table 13.3 to be $R_w/R = (1 - r)/\sqrt{1 - t}$. So, with large
families, within-family selection will be superior to individual
selection if $(1 - r)$ exceeds $\sqrt{1 - t}$. The graphs in Fig. 13.5 show that the
correlation, t, in full-sib families would have to exceed about 0.75 to
0.85, according to the family size. Correlations as high as this cannot
arise without a large amount of variation due to common
environment. Correlations high enough to make within-family selection
superior to individual selection are, however, not commonly found,
and the advantage of within-family selection therefore comes chiefly
from the reduced rate of inbreeding which was mentioned earlier.
Fig. 13.5 shows how much will be sacrificed in the rate of response if
within-family selection is applied. Most characters have full-sib
correlations below about 0.5, and within-family selection is then only
about half as effective as individual selection.

Fig. 13.5. Response expected under within-family selection
relative to that of individual selection, plotted against the phenotypic
correlation, t. The separate curves refer to different family sizes,
as indicated.
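
The cost of within-family selection can be sketched in the same way. The Python fragment below assumes the finite-family expression R_w/R = (1 - r) * sqrt((n - 1) / (n(1 - t))), taken here as the form in Table 13.3 (not reproduced in this section); it tends to (1 - r)/sqrt(1 - t) for large families. The function name is hypothetical.

```python
import math


def within_family_relative_response(n: int, r: float, t: float) -> float:
    """Response to within-family selection relative to individual selection.

    Assumed form: (1 - r) * sqrt((n - 1) / (n * (1 - t))).
    """
    return (1 - r) * math.sqrt((n - 1) / (n * (1 - t)))


if __name__ == "__main__":
    # Full sibs (r = 0.5): relative response across correlations and family sizes.
    for t in (0.25, 0.5, 0.75, 0.875):
        row = [round(within_family_relative_response(n, 0.5, t), 3)
               for n in (2, 4, 10, 30)]
        print(t, row)
```

Under these assumptions the relative response for full sibs stays below 1 until t is very high, in keeping with the statement that the method is normally chosen for its lower rate of inbreeding rather than for its response.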

Weights to be attached to families of different size.
Throughout this chapter we have assumed that all families whose mean values
are to be used in selection have equal numbers of individuals in them;
i.e. n is the same for all families. This is a reasonable enough
assumption to make when we are considering the expected response from the
point of view of the planner who has to decide on the method of
selection to be applied. But, in practice, families are very seldom of
equal size and if we are to apply any method of selection based on
family means we are immediately faced with the problem of how to
make allowance for different numbers in the families. Obviously the
mean of a large family is more reliable than that of a small one, and
should be given more weight when the selection is being made. The
solution of the problem comes from a consideration of the heritability
as the regression of breeding value on phenotypic value. The best
estimate of the breeding value of a family is obtained by multiplying
the family mean (measured as a deviation from the population mean)
by the heritability of family means. The appropriate weighting factor
for family means is therefore the heritability of family means,
calculated separately for each family according to its size. Quantities
that are constant for all families may be omitted without altering the
relative weights. Thus, in the application of family selection, each
family mean, calculated as a deviation from the population mean,
should be weighted by $[1 + (n - 1)r]/[1 + (n - 1)t]$, and in sib selection by
$n/[1 + (n - 1)t]$. The heritability of within-family deviations does not
contain the term n, and is therefore unaffected by family size. Thus no
weighting is required in the application of within-family selection. The
weighting factor to be used in combined selection has already been
given in equation 13.6.
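
A minimal sketch of the weighting, in Python, may make the procedure concrete. It assumes the two weighting factors read as [1 + (n - 1)r]/[1 + (n - 1)t] for family selection and n/[1 + (n - 1)t] for sib selection; the function names and the example families are hypothetical.

```python
def family_selection_weight(n: int, r: float, t: float) -> float:
    """Weight for a family mean under family selection (constants omitted)."""
    return (1 + (n - 1) * r) / (1 + (n - 1) * t)


def sib_selection_weight(n: int, t: float) -> float:
    """Weight for a family mean under sib selection (constants omitted)."""
    return n / (1 + (n - 1) * t)


def weighted_scores(families, population_mean, r, t):
    """Return (family mean deviation * weight) for each (mean, size) pair."""
    return [
        (mean - population_mean) * family_selection_weight(n, r, t)
        for mean, n in families
    ]


if __name__ == "__main__":
    # Hypothetical full-sib families: (family mean, family size).
    families = [(12.0, 3), (12.0, 10), (11.0, 6)]
    print(weighted_scores(families, population_mean=10.0, r=0.5, t=0.25))
```

Two families with the same mean are then ranked by size, the larger family receiving the higher score when its mean lies above the population mean.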

We  conclude  this  chapter  with  an  example  from  a  laboratory 
experiment  which  compared  the  responses  actually  obtained  under 
different  methods  of  selection. 


Example 13.1. In an experiment with Drosophila melanogaster
selection for abdominal bristle-number was made by three methods (Clayton,
Morris, and Robertson, 1957). The responses to individual selection at
different intensities were quoted in Example 11.2. Sib selection was also
applied in both full-sib and half-sib families and the responses compared
with expectation. Here we shall compare the responses under sib selection
with the response under individual selection, according to the formula in
Table 13.3. The same proportion of the population was selected in each
case, namely 20 per cent, but the intensities of selection under sib selection
were lower than under individual selection because there was a smaller
total number of families than of individuals: 10 half-sib families, 20
full-sib families, and 100 individuals. The intensity of selection under
individual selection was 1.40. Those under sib selection are given in the
table, together with the other data needed for calculating the expected
responses under sib selection relative to that under individual selection.

             Data                        Relative response, R_s/R

       Full sibs   Half sibs                  Full sibs   Half sibs
  i      1.33        1.27       Exp.            0.832       0.614
  n      12          20         Obs. up         0.618       0.527
  r      0.50        0.275      Obs. down       0.919       0.635
  t      0.265       0.121

In applying the formula from Table 13.3 we have to take account of the
intensity of selection, multiplying by the ratio of the intensity under sib
selection to the intensity under individual selection. It will be seen that
the correlation of breeding values, r, between half sibs is a little greater
than 1/4. This is because the females mated to a male were not entirely
unrelated to each other. The ratios of the responses expected and observed
are given in the right-hand half of the table. The expectation is that
individual selection should be the best method, and so it proved to be.
There is, however, some discrepancy between the upward and downward
responses, of which the reason is not known.
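
The expected ratios in the example can be reproduced with a few lines of Python. The sketch assumes that the formula for sib selection in Table 13.3 (not reproduced in this section) has the form R_s/R = nr / sqrt(n[1 + (n - 1)t]), multiplied by the ratio of selection intensities as described above; the function name is hypothetical.

```python
import math


def sib_relative_response(i_sib: float, i_ind: float, n: int, r: float, t: float) -> float:
    """Expected response to sib selection relative to individual selection.

    Assumed form: (i_sib / i_ind) * n * r / sqrt(n * (1 + (n - 1) * t)).
    """
    return (i_sib / i_ind) * n * r / math.sqrt(n * (1 + (n - 1) * t))


if __name__ == "__main__":
    # Data from the example; intensity under individual selection is 1.40.
    full = sib_relative_response(1.33, 1.40, n=12, r=0.50, t=0.265)
    half = sib_relative_response(1.27, 1.40, n=20, r=0.275, t=0.121)
    print(round(full, 3), round(half, 3))
```

Both values agree with the expectations quoted in the table (0.832 and 0.614) to three decimal places.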
INBREEDING AND CROSSBREEDING:

I.  Changes  of  Mean  Value 

We  turn  our  attention  now  to  inbreeding,  the  second  of  the  two  ways 
open to the breeder for changing the genetic constitution of a
population. The harmful effects of inbreeding on reproductive rate and
general  vigour  are  well  known  to  breeders  and  biologists,  and  were 
mentioned  in  Chapter  6  as  one  of  the  two  basic  genetic  phenomena 
displayed  by  metric  characters.  The  opposite,  or  complementary, 
phenomenon  of  hybrid  vigour  resulting  from  crosses  between  inbred 
lines  or  between  different  races  or  varieties  is  equally  well  known,  and 
forms  an  important  means  of  animal  and  plant  improvement.  The 
production  of  lines  for  subsequent  crossing  in  the  utilisation  of 
hybrid  vigour  is  one  of  two  main  purposes  for  which  inbreeding  may 
be  carried  out.  The  other  is  the  production  of  genetically  uniform 
strains,  particularly  of  laboratory  animals,  for  use  in  bioassay  and  in 
research  in  a  variety  of  fields.  Inbreeding  in  itself,  however,  is  almost 
universally  harmful  and  the  breeder  or  experimenter  normally  seeks 
to avoid it as far as possible, unless for some specific purpose.
Mention should be made here of naturally self-fertilising plants, to which
much of the discussion in this chapter is inapplicable. Since
inbreeding is their normal mating system they cannot be further inbred: they
can,  however,  be  crossed,  but  they  do  not  regularly  show  hybrid 
vigour. 

In the treatment of inbreeding given in Chapter 3 the
consequences were described in terms of the expected changes of gene
frequencies  and  of  genotype  frequencies.  Here  we  have  to  show  how 
the  changes  of  gene  and  genotype  frequencies  are  expected  to  affect 
metric  characters.  And  at  the  same  time  we  have  to  consider  the 
observed  consequences  of  inbreeding  and  crossing,  and  see  what 
light  they  throw  on  the  properties  of  the  genes  concerned  with 
metric  characters.  We  shall  first  consider  the  changes  of  mean  value 
and  then,  in  the  next  chapter,  the  changes  of  variance  resulting  from 
inbreeding and crossbreeding. Finally, in Chapter 16, we shall
consider the combination of selection with inbreeding and crossbreeding
by  means  of  which  hybrid  vigour  may  be  utilised  in  animal  and  plant 
improvement. 

Inbreeding  Depression 

The most striking observed consequence of inbreeding is the
reduction of the mean phenotypic value shown by characters
connected with reproductive capacity or physiological efficiency, the
phenomenon known as inbreeding depression. Some examples of
inbreeding depression are given in Table 14.1, from which one can see
what sort of characters are subject to inbreeding depression, and,
very roughly, the magnitude of the effect. From the results of these
and many other studies we can make the generalisation that
inbreeding tends to reduce fitness. Thus, characters that form an important
component  of  fitness,  such  as  litter  size  or  lactation  in  mammals, 
show  a  reduction  on  inbreeding;  whereas  characters  that  contribute 
little  to  fitness,  such  as  bristle  number  in  Drosophila,  show  little  or  no 
change. 

In  saying  that  a  certain  character  shows  inbreeding  depression, 
we  refer  to  the  average  change  of  mean  value  in  a  number  of  lines. 
The  separate  lines  are  commonly  found  to  differ  to  a  greater  or  lesser 
extent  in  the  change  they  show,  as,  indeed,  we  should  expect  in 
consequence of random drift of gene frequencies. This matter of
differentiation of lines will be discussed later when we deal with changes
of  variance.  It  is  mentioned  here  only  to  emphasise  the  fact  that  the 
changes  of  mean  value  now  to  be  discussed  refer  to  changes  of  the 
mean  value  of  a  number  of  lines  derived  from  one  base  population. 
As  in  our  earlier  account  of  inbreeding  we  have  to  picture  the  "whole 
population"  consisting  of  many  lines.  The  population  mean  then 
refers  to  the  whole  population  and  inbreeding  depression  refers  to  a 
reduction of this population mean. Let us now consider the
theoretical basis of the change of population mean on inbreeding.

First,  we  may  recall  and  extend  some  of  the  conclusions  from 
Chapter 3, supposing at first that selection does not in any way
interfere with the dispersion of gene frequencies. Since the gene
frequencies in the population as a whole do not change on inbreeding,
any change of the population mean must be attributed to the changes
of genotype frequencies. Inbreeding causes an increase in the
frequencies of homozygous genotypes and a decrease of heterozygous genotypes.

Table 14.1

Some Examples of Inbreeding Depression

The figures given show approximately the decrease of mean
phenotypic value per 10 per cent increase of the coefficient
of inbreeding: column (1) in absolute units; column (2) as
percentage of non-inbred mean; column (3) in terms of the
original phenotypic standard deviation (data not available
for all characters).

Character                              Inbreeding depression per
                                       10% increase of F

                                       (1)                 (2)     (3)
                                       units               %       /σP

Cattle (A. Robertson, 1954)
  Milk-yield                           29.6 gal.           3.2     0.17

Pigs (Dickerson et al., 1954)
  Litter size at birth                 0.38 young          4.6     0.15
  Weight at 154 days                   3.64 lb.            2.7     0.12

Sheep (Morley, 1954)
  Fleece weight                        0.64 lb.            5.5     0.51
  Length of wool                       0.12 cm.            1.3     0.14
  Body weight at 1 year                2.91 lb.            3.7     0.36

Poultry (Shoffner, 1948)
  Egg-production                       9.26 eggs           6.2     -
  Hatchability                         4.36 %              6.4     -
  Body weight                          0.04 lb.            0.8     -

Mice (original data)
  Litter size at birth                 0.60 young          8.0     0.28
  Weight at 6 weeks (♀♀)               0.58 gm.            2.6     0.26

Drosophila melanogaster
(Tantawy and Reeve, 1956)
  Fertility (per pair per day)         2.2 offspring       6.7     -
  Viability (egg to adult)             2.6 %               3.7     -
  Wing length                          2.8 (1/100 mm.)     1.4     0.80

Drosophila subobscura
(Hollingsworth and Smith, 1955)
  Fertility (per pair per day)         6.0 offspring       12.5    -
  Egg hatchability                     8.3 %               8.3     -

Therefore a change of population mean on inbreeding must be
connected with a difference of genotypic value between homozygotes and
heterozygotes. Let us now see more precisely how the population
mean depends on the degree of inbreeding, which we may
conveniently express as the inbreeding coefficient, F.

Consider a population, subdivided into a number of lines, with a
coefficient of inbreeding, F. The expression for the population mean
is derived by putting together the reasoning set out in Tables 3.1 and
7.1, in the following way. Table 14.2 shows the three genotypes of a
two-allele locus with their genotypic frequencies in the whole
population. These frequencies come from Table 3.1, p and q being the
gene frequencies in the whole population. Then the third column
gives the genotypic values assigned as in Fig. 7.1. The value and
frequency of each genotype are multiplied together in the right-hand
column, the summation of which gives the contribution of this locus
to the population mean.


Table 14.2

Genotype     Frequency        Value     Frequency x Value

A1A1         p² + pqF          +a       p²a + pqaF
A1A2         2pq - 2pqF         d       2pqd - 2pqdF
A2A2         q² + pqF          -a       -q²a - pqaF

                                        Sum = a(p - q) + 2dpq - 2dpqF
                                            = a(p - q) + 2dpq(1 - F)


Thus, referring still to the effects of a single
locus, we find that a population with inbreeding coefficient F has a
mean genotypic value:

$$
\begin{aligned}
M_F &= a(p - q) + 2dpq(1 - F) && (14.1) \\
    &= M_0 - 2dpqF && (14.2)
\end{aligned}
$$

where $M_0$ is the population mean before inbreeding, from equation
7.2. The change of mean resulting from inbreeding is therefore
$-2dpqF$. This shows that a locus will contribute to a change of mean
value on inbreeding only if d is not zero; in other words if the value
of the heterozygote differs from the average value of the homozygotes.
This conclusion, though demonstrated in detail only for two alleles
at a locus, is equally valid for loci with more than two alleles. The
following general conclusions can therefore be drawn: that a change
of mean value on inbreeding is a consequence of dominance at the
loci concerned with the character, and that the direction of the change
is toward the value of the more recessive alleles. The dominance may
be partial or complete, or it may be overdominance; all that is
necessary for a locus to contribute to a change of mean is that the
heterozygote should not be exactly intermediate between the two homozygotes.
Equation 14.2 shows also that the magnitude of the change of mean
depends on the gene frequencies. It is greatest when pq is maximal:
that is, when $p = q = \tfrac{1}{2}$. Genes at intermediate frequencies therefore
contribute more to a change of mean than genes at high or low
frequencies, other things being equal.
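
The single-locus result can be checked directly from the genotype frequencies. The Python sketch below is only an illustration of the arithmetic in Table 14.2; the function names and the numerical values of p, a, d and F are hypothetical.

```python
def genotype_frequencies(p: float, F: float) -> dict:
    """Frequencies of A1A1, A1A2, A2A2 with inbreeding coefficient F."""
    q = 1 - p
    return {
        "A1A1": p * p + p * q * F,
        "A1A2": 2 * p * q * (1 - F),
        "A2A2": q * q + p * q * F,
    }


def locus_mean(p: float, a: float, d: float, F: float) -> float:
    """Contribution of one locus to the population mean (frequency x value)."""
    values = {"A1A1": a, "A1A2": d, "A2A2": -a}
    freqs = genotype_frequencies(p, F)
    return sum(freqs[g] * values[g] for g in freqs)


def change_on_inbreeding(p: float, d: float, F: float) -> float:
    """Change of mean predicted by equation 14.2: -2 d p q F."""
    return -2 * d * p * (1 - p) * F


if __name__ == "__main__":
    p, a, d, F = 0.3, 1.0, 0.5, 0.25
    observed = locus_mean(p, a, d, F) - locus_mean(p, a, d, 0.0)
    print(observed, change_on_inbreeding(p, d, F))
```

Setting d to zero in this sketch leaves the mean unchanged at every F, and for a fixed d the change is largest at p = 0.5.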

Now let us consider the combined effect of all the loci that affect
the character. In so far as the genotypic values of the loci combine
additively, the population mean is given by summation of the
contributions of the separate loci, thus:

$$
\begin{aligned}
M_F &= \sum a(p - q) + 2\left(\sum dpq\right)(1 - F) && (14.3) \\
    &= M_0 - 2F\sum dpq && (14.4)
\end{aligned}
$$

and the change of mean on inbreeding is $-2F\sum dpq$.

These  expressions  show  what  are  the  circumstances  under  which 
a  metric  character  will  show  a  change  of  mean  value  on  inbreeding. 
The chief one is if the dominance of the genes concerned is
preponderantly in one direction; i.e. if there is directional dominance.
If  the  genes  that  increase  the  value  of  the  character  are  dominant 
over  their  alleles  that  reduce  the  value,  then  inbreeding  will  result  in 
a  reduction  of  the  population  mean,  i.e.  a  change  in  the  direction  of 
the  more  recessive  alleles.  The  contribution  of  each  locus,  however, 
depends also on its gene frequencies, those with intermediate
frequencies having the greatest effect on the change of mean value.
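
The role of directional dominance can be illustrated by summing the single-locus changes over several loci. The Python sketch below assumes additive combination of loci, as in the text; the function name and the example loci are hypothetical.

```python
def change_of_mean(loci, F: float) -> float:
    """Change of population mean on inbreeding, -2 F * sum(d p q).

    loci -- iterable of (p, d) pairs, one per locus, where p is the
            frequency of the allele that increases the character and
            d is the dominance deviation at that locus.
    """
    return -2 * F * sum(d * p * (1 - p) for p, d in loci)


if __name__ == "__main__":
    directional = [(0.5, 1.0)] * 4                    # dominance all in one direction
    ambidirectional = [(0.5, 1.0), (0.5, -1.0)] * 2   # dominance cancels out
    for F in (0.0, 0.25, 0.5):
        print(F, change_of_mean(directional, F), change_of_mean(ambidirectional, F))
```

With dominance in both directions the contributions cancel and the mean does not move, whereas with directional dominance the change grows in direct proportion to F.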

We have now reached two conclusions about the effects of
inbreeding, one from observation, that inbreeding reduces fitness; the
other from theory, that the change is in the direction of the more
recessive alleles. Putting these two conclusions together leads to the
generalisation, already familiar from Mendelian genetics, that
deleterious alleles tend to be recessive.

Another  conclusion  that  can  be  drawn  from  equation  14.4  is  that 
when  loci  combine  additively  the  change  of  mean  on  inbreeding 
should  be  directly  proportional  to  the  coefficient  of  inbreeding.  In 
other  words  the  change  of  mean  should  be  a  straight  line  when 
plotted against F. Two examples of experimentally observed
inbreeding depression are illustrated in Fig. 14.1.

On the whole the observed inbreeding depression does tend to be
linear with respect to F, and this might be taken as evidence that
epistatic interaction between loci is not of great importance. There
are, however, several practical difficulties that stand in the way of
drawing firm conclusions from observations of the rate of inbreeding
depression. One is that as inbreeding proceeds and reproductive
capacity deteriorates, it soon becomes impossible to avoid the loss of
some lines. The survivors are then a selected group to which the
theoretical expectations no longer apply. Thus precise measurement
of the rate of inbreeding depression can generally be made only over
the early stages, before the inbreeding coefficient reaches high levels.
Another difficulty, met with particularly in the study of mammals,
arises from maternal effects. Maternal qualities are among the most
sensitive characters to inbreeding depression. The effect of
inbreeding on another character that is influenced by maternal effects is
therefore two-fold: part being attributable to the inbreeding of the
individuals measured and part to the inbreeding in the mothers. So
the relationship between the character measured and the coefficient
of inbreeding cannot be depicted in any simple manner. In
consequence of these difficulties reliable conclusions cannot easily be
drawn from the exact form of the inbreeding depression observed in
experiments.

Fig. 14.1. Examples of inbreeding depression affecting fertility.
(a) Litter-size in mice (original data). Mean number born alive in
1st litters, plotted against the coefficient of inbreeding of the litters.
The first generation was by double-first-cousin mating; thereafter
by full-sib mating. No selection was practised. (b) Fertility in
Drosophila subobscura. Mean number of adult progeny per pair per
day, plotted against the inbreeding coefficient of the parents.
Consecutive full-sib matings. (Redrawn from Hollingsworth &
Smith, 1955.)

Example 14.1. The complications arising from maternal effects may
be illustrated by litter size in pigs and mice. Litter size is a composite
character, which is partly an attribute of the mother and partly an attribute
of the young in the litter. It is therefore influenced both by the inbreeding
of the mother and by the inbreeding of the young, and these two influences
are difficult to disentangle in practice. Studies on pigs (Dickerson et al.,
1954) have shown that the reduction of litter size due to inbreeding in the
mother alone is about 0.20 young per 10 per cent of inbreeding; and the
reduction due to inbreeding in the young alone is about 0.17 young per 10
per cent of inbreeding. Thus the effects of inbreeding in the mother and in
the young are about equally important. A small experiment with mice
(original data) gave much the same picture. A rough separation of the
effects of inbreeding in the mother and in the young was made by means of
crosses between lines after 2 or 3 generations of sib mating. (The
justification for regarding this as a measure of the inbreeding depression will be
explained in the next section.) The mean litter sizes, arranged according
to the coefficient of inbreeding of the mothers and of the young, are given
in the table.

                          Inbreeding coefficient of mothers

Inbreeding coefficient
of young                   0%        37.5%       50%

  0%                       8.2       7.5         7.3
  50%                      -         6.3         -
  59%                      -         -           5.8

The three comparisons in the first row show the effect of inbreeding in the
mothers, and give values of 0.19, 0.18 and 0.16 for the reduction of litter
size per 10 per cent of inbreeding. The comparisons in the second and
third column show the effect of inbreeding in the young, and give values
of 0.24 and 0.25 for the reduction per 10 per cent of inbreeding. Thus
inbreeding in the young had rather more effect than inbreeding in the
mother. These results, however, should not be taken as being
characteristic of mice in general.
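
The five rates quoted in the example follow from the table by simple division. The Python sketch below reproduces them; the function name is hypothetical, and the litter sizes and inbreeding coefficients are those given in the table.

```python
def reduction_per_10_percent(mean_low_f, mean_high_f, f_low, f_high):
    """Reduction of litter size per 10 per cent of inbreeding.

    f_low and f_high are inbreeding coefficients in per cent.
    """
    return (mean_low_f - mean_high_f) / (f_high - f_low) * 10


if __name__ == "__main__":
    # First row: young non-inbred, mothers at 0, 37.5 and 50 per cent.
    mothers = [
        reduction_per_10_percent(8.2, 7.5, 0.0, 37.5),
        reduction_per_10_percent(8.2, 7.3, 0.0, 50.0),
        reduction_per_10_percent(7.5, 7.3, 37.5, 50.0),
    ]
    # Second and third columns: mothers fixed, young at 0 versus 50 or 59 per cent.
    young = [
        reduction_per_10_percent(7.5, 6.3, 0.0, 50.0),
        reduction_per_10_percent(7.3, 5.8, 0.0, 59.0),
    ]
    print([round(x, 2) for x in mothers], [round(x, 2) for x in young])
```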


The effect of selection. The neglect of selection during
inbreeding is an unrealistic omission because natural selection cannot
be wholly avoided even in laboratory experiments. Since inbreeding
tends to reduce fitness, natural selection is likely to oppose the
inbreeding process by favouring the least homozygous individuals.
The balance between selection and the dispersion of gene frequencies
was discussed in Chapter 4, and the only further point that need be
added here is that the operation of natural selection makes the
inbreeding depression dependent on the rate of inbreeding. One must
distinguish between the state of dispersion of gene frequencies and
the coefficient of inbreeding as computed from the population size or
the pedigree relationships. The state of dispersion is what determines
the amount of inbreeding depression; the coefficient of inbreeding is a
measure of the state of dispersion only in the absence of selection.
When selection operates, the state of dispersion will be less than that
indicated by the coefficient of inbreeding, and the discrepancy
between the two will be greater when the rate of inbreeding is slower,
because the selection will then be relatively more potent. Therefore
one must expect the inbreeding depression caused by a given increase
of the computed coefficient of inbreeding to be less when inbreeding
is slow than when it is rapid.


Heterosis 

Complementary  to  the  phenomenon  of  inbreeding  depression  is 
its  opposite,  "hybrid  vigour"  or  heterosis.  When  inbred  lines  are 
crossed, the progeny show an increase of those characters that
previously suffered a reduction from inbreeding. Or, in general terms, the
fitness  lost  on  inbreeding  tends  to  be  restored  on  crossing.  That  the 
phenomenon  of  heterosis  is  simply  inbreeding  depression  in  reverse 
can  be  seen  by  consideration  of  how  the  population  mean  depends  on 
the  coefficient  of  inbreeding,  as  shown  in  equation  14.4.  Consider,  as 
before,  a  population  subdivided  into  a  number  of  lines.  If  the  lines 
are  crossed  at  random,  the  average  coefficient  of  inbreeding  in  the 
cross-bred  progeny  reverts  to  that  of  the  base  population.  Thus,  if  a 
number  of  crosses  are  made  at  random  between  the  lines,  the  mean 
value  of  any  character  in  the  cross-bred  progeny  is  expected  to  be  the 
same  as  the  population  mean  of  the  base  population.  In  other  words, 
the  heterosis  on  crossing  is  expected  to  be  equal  to  the  depression  on 
inbreeding.  Furthermore,  if  the  population  is  continued  after  the 
crossing  by  random  mating  among  the  cross-bred  and  subsequent 
generations,  the  coefficient  of  inbreeding  will  remain  unchanged,  and 
the  population  mean  is  consequently  expected  to  remain  at  the  level 
of the base population. We may, thus, make the following generalisation
on theoretical grounds: that, in the absence of selection, inbreeding
followed by crossing of the lines in a large population is not
expected  to  make  any  permanent  change  in  the  population  mean. 

Example  14.2.  An  experiment  with  mice  (R.  C.  Roberts,  unpublished) 
was  designed  to  test  the  theoretical  expectation  that  in  the  absence  of 
selection  the  heterosis  on  crossing  should  be  equal  to  the  depression  on 
inbreeding.  The  character  studied  was  litter  size.  Thirty  lines  taken  from 
a  random-bred  population  were  inbred  by  3  consecutive  generations  of 
full-sib mating, bringing the coefficient of inbreeding up to 50 per cent in
the litters and 37.5 per cent in the mothers. No selection was practised
during the inbreeding, and only 2 of the 30 lines were lost as a consequence
of their inbreeding depression.

                                 Litter size
Before inbreeding                    8.1
Inbred (litters: F = 50%)            5.7
Cross-bred                           8.5

After  the  third  generation  of  inbreeding,  crosses  were  made  at  random 
between the lines, and in the next generation crosses between the F1's were
made so as to give cross-bred mothers with non-inbred young. The mean
litter sizes observed at the different stages are given in the table. The
inbreeding depression was 2.4 and the heterosis 2.8; the two are equal
within  the  limits  of  experimental  error. 
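
The arithmetic of Example 14.2 can be checked with a short sketch. It
reads the table values as 8.1, 5.7 and 8.5, and it uses the standard
recurrence for the inbreeding coefficient under full-sib mating, which
is not stated in this section and is brought in here as an assumption.
The function name is illustrative only.

```python
def full_sib_inbreeding(n_generations):
    """Inbreeding coefficients under repeated full-sib mating.

    Assumes the standard recurrence F_t = (1 + 2*F_{t-1} + F_{t-2}) / 4,
    starting from a non-inbred base population (F = 0).
    """
    prev2, prev1 = 0.0, 0.0
    coefficients = []
    for _ in range(n_generations):
        current = 0.25 * (1.0 + 2.0 * prev1 + prev2)
        coefficients.append(current)
        prev2, prev1 = prev1, current
    return coefficients


# Mean litter sizes as read from the table of Example 14.2.
before_inbreeding = 8.1
inbred = 5.7
cross_bred = 8.5

# Depression: fall from the base population to the inbred lines.
depression = before_inbreeding - inbred
# Heterosis: rise from the inbred lines to the cross-bred generation.
heterosis = cross_bred - inbred

print("F by generation:", full_sib_inbreeding(3))
print(f"inbreeding depression = {depression:.1f}")
print(f"heterosis             = {heterosis:.1f}")
```

The mothers of the third-generation litters belong to the second
generation, which is why their coefficient is one step behind that of
the litters.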


Single  crosses.  The  foregoing  theoretical  conclusions  refer  to 
the  average  of  a  large  number  of  crosses  between  lines  derived  from  a 
single  base  population.  In  practice,  however,  one  is  often  interested 
in  a  somewhat  different  problem,  namely  the  heterosis  shown  by  a 
particular  cross  between  two  lines,  or  between  two  populations  which 
may  have  no  known  common  origin.  To  refer  the  changes  of  mean 
value  to  changes  of  inbreeding  coefficient  would  be  inappropriate 
under  these  circumstances,  and  the  theoretical  basis  of  the  heterosis  is 
better  expressed  in  terms  of  the  gene  frequencies  in  the  two  lines. 
We  may  recall  from  Chapter  3  that  inbreeding  leads  to  a  dispersion  of 
gene  frequencies  among  the  lines,  the  lines  becoming  differentiated 
in  gene  frequency  as  inbreeding  proceeds;  and  the  coefficient  of 
inbreeding  is  a  means  of  expressing  the  degree  of  differentiation 
(equation  3.14).  In  turning  from  the  inbreeding  coefficient  to  the 
gene  frequencies  as  a  basis  for  discussion  we  are  therefore  turning 
from the general, or average, consequence of crossing, to the particular
circumstances in two lines.

Let us, then, consider two populations, referred to as the "parent
populations," both random-bred though not necessarily large. The
parent populations are crossed to produce an F1 or "first cross-bred
generation," and the F1 individuals are mated together at random to
produce an F2 or "second cross-bred generation." The amount of
heterosis shown by the F1 or the F2 will be measured as the deviation
from the mid-parent value, i.e. as the difference from the mean of the
two parent populations. First consider the effects of a single locus
with two alleles whose frequencies are p and q in one population, and
p' and q' in the other. Let the difference of gene frequency between
the two populations be y, so that y = p - p' = q' - q. The algebra is
then simplified by writing the gene frequencies p' and q' in the second
population as (p - y) and (q + y). Let the genotypic values be a, d, -a,
as before. They are assumed to be the same in the two populations,
epistatic interaction being disregarded. We have to find the
mean of each parent population and the mid-parent value; then the
mean of the F1 and the mean of the F2. The parental means, $M_{P_1}$ and
$M_{P_2}$, are found from equation 7.2. They are

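\[
\begin{aligned}
M_{P_1} &= a(p - q) + 2dpq \\
M_{P_2} &= a(p - y - q - y) + 2d(p - y)(q + y) \\
        &= a(p - q - 2y) + 2d[pq + y(p - q) - y^2]
\end{aligned}
\]

The mid-parent value is

\[
\begin{aligned}
M_{\bar{P}} &= \tfrac{1}{2}(M_{P_1} + M_{P_2}) \\
            &= a(p - q - y) + d[2pq + y(p - q) - y^2]
\end{aligned}
\qquad (14.5)
\]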

Table 14.3
Frequencies of Zygotes in the F1

                                Gametes from P1
                                A1            A2
                                p             q

Gametes     A1    p - y         p(p - y)      q(p - y)
from P2     A2    q + y         p(q + y)      q(q + y)

When the two populations are crossed to produce the F1, individuals
taken at random from one population are mated to individuals
taken at random from the other population. This is equivalent
to taking genes at random from the two populations, as shown in
Table 14.3. The F1 is therefore constituted as follows:

Genotypes           A1A1          A1A2               A2A2
Frequencies         p(p - y)      2pq + y(p - q)     q(q + y)
Genotypic values    a             d                  -a

The mean genotypic value of the F1 is therefore:

\[
\begin{aligned}
M_{F_1} &= a(p^2 - py - q^2 - qy) + d[2pq + y(p - q)] \\
        &= a(p - q - y) + d[2pq + y(p - q)]
\end{aligned}
\qquad (14.6)
\]

The amount of heterosis, expressed as the difference between the F1
and the mid-parent values, is obtained by subtracting equation 14.5
from equation 14.6:

\[
H_{F_1} = M_{F_1} - M_{\bar{P}} = dy^2 \qquad (14.7)
\]

Thus heterosis, just like inbreeding depression, depends for its occurrence
on dominance. Loci without dominance (i.e. loci for which
d = 0) cause neither inbreeding depression nor heterosis. The amount
of heterosis following a cross between two particular lines or populations
depends on the square of the difference of gene frequency (y)
between  the  populations.  If  the  populations  crossed  do  not  differ  in 
gene  frequency  there  will  be  no  heterosis,  and  the  heterosis  will  be 
greatest  when  one  allele  is  fixed  in  one  population  and  the  other  allele 
in  the  other  population. 
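
The single-locus result can be checked numerically. The sketch below
builds the F1 directly from the gamete frequencies of the two parent
populations and compares the resulting heterosis with d times the
square of y. The function names and the numerical values of a and d are
illustrative assumptions, not taken from the text.

```python
def population_mean(p, a, d):
    """Mean of a random-mating population: a(p - q) + 2dpq."""
    q = 1.0 - p
    return a * (p - q) + 2.0 * d * p * q


def f1_mean(p1, p2, a, d):
    """Mean of the F1, formed by uniting one gamete from each parent population."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    freq_a1a1 = p1 * p2
    freq_a1a2 = p1 * q2 + q1 * p2
    freq_a2a2 = q1 * q2
    return a * freq_a1a1 + d * freq_a1a2 - a * freq_a2a2


def f1_heterosis(p1, p2, a, d):
    """Deviation of the F1 mean from the mid-parent value."""
    mid_parent = 0.5 * (population_mean(p1, a, d) + population_mean(p2, a, d))
    return f1_mean(p1, p2, a, d) - mid_parent


a, d = 2.0, 1.0  # hypothetical genotypic values
for p1, p2 in [(0.7, 0.7), (0.7, 0.4), (1.0, 0.0)]:
    y = p1 - p2
    print(f"p = {p1}, p' = {p2}: heterosis = {f1_heterosis(p1, p2, a, d):.4f}, "
          f"d*y^2 = {d * y * y:.4f}")
```

The three cases correspond to no difference of gene frequency, an
intermediate difference, and fixation of different alleles in the two
populations.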

Now  consider  the  joint  effects  of  all  loci  at  which  the  two  parent 
populations  differ.  In  so  far  as  the  genotypic  values  attributable  to 
the  separate  loci  combine  additively,  we  may  represent  the  heterosis 
produced  by  the  joint  effects  of  all  the  loci  as  the  sum  of  their  separate 
contributions.  Thus  the  heterosis  in  the  F1  is 


\[
H_{F_1} = \sum dy^2 \qquad (14.8)
\]


If  some  loci  are  dominant  in  one  direction  and  some  in  the  other  their 
effects  will  tend  to  cancel  out,  and  no  heterosis  may  be  observed,  in 
spite  of  the  dominance  at  the  individual  loci.  The  occurrence  of 
heterosis on crossing is therefore, like inbreeding depression, dependent
on directional dominance, and the absence of heterosis is not
sufficient  ground  for  concluding  that  the  individual  loci  show  no 
dominance. 
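
The dependence on directional dominance can be illustrated by summing
d times the square of y over loci. The loci below are hypothetical:
both sets have the same degree of dominance at every locus, and differ
only in whether the dominance is all in one direction.

```python
def total_f1_heterosis(loci):
    """Sum of d * y**2 over loci, assuming no epistatic interaction.

    Each locus is given as a (d, y) pair: the dominance deviation and the
    difference of gene frequency between the two parent populations.
    """
    return sum(d * y ** 2 for d, y in loci)


# Hypothetical loci: same |d| and same y at every locus.
directional = [(0.5, 0.6), (0.5, 0.6), (0.5, 0.6), (0.5, 0.6)]
mixed = [(0.5, 0.6), (-0.5, 0.6), (0.5, 0.6), (-0.5, 0.6)]

print("dominance all in one direction:", total_f1_heterosis(directional))
print("dominance in both directions:  ", total_f1_heterosis(mixed))
```

In the second set every locus shows dominance, yet the contributions
cancel and the cross shows no heterosis.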

Before  we  go  on  to  consider  the  F2  it  is  perhaps  worth  noting  that 
the formulation of the heterosis in terms of the square of the difference
of gene frequency, in equations 14.7 and 14.8, is quite in line
with the previous formulation of the inbreeding depression in terms
of the coefficient of inbreeding. If we envisage once more the whole
population subdivided into lines, and we suppose pairs of lines to be
taken at random, then the mean squared difference of gene frequency
between the pairs of lines will be equal to twice the variance of gene
frequency among the lines. That is: $\overline{y^2} = 2\sigma_q^2$. And, by equation
3.14, $2\sigma_q^2 = 2pqF$, where p and q are the gene frequencies in the
population as a whole. Therefore the mean amount of heterosis shown by
crosses  between  random  pairs  of  lines  is  equal  to  the  inbreeding 
depression  as  given  in  equation  14.2,  though  of  opposite  sign. 
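
The statement that the mean squared difference between random pairs of
lines is twice the variance among lines can be verified by direct
enumeration. The sketch assumes that the two lines of a pair are drawn
independently (so a line may be paired with itself); with many lines
this makes no appreciable difference. The gene frequencies are
hypothetical.

```python
from itertools import product


def mean_squared_difference(freqs):
    """Mean of (q_i - q_j)**2 over all ordered pairs of lines, drawn independently."""
    n = len(freqs)
    total = sum((qi - qj) ** 2 for qi, qj in product(freqs, repeat=2))
    return total / (n * n)


def variance(freqs):
    """Variance of gene frequency among the lines (divisor n)."""
    n = len(freqs)
    mean = sum(freqs) / n
    return sum((q - mean) ** 2 for q in freqs) / n


line_freqs = [0.1, 0.3, 0.35, 0.5, 0.8, 0.95]  # hypothetical gene frequencies

print("mean squared difference:", mean_squared_difference(line_freqs))
print("twice the variance:     ", 2.0 * variance(line_freqs))
```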

Now let us consider the F2 of a particular cross of two parent
populations, the F2 being made by random mating among the individuals
of the F1. In consequence of the random mating, the genotype
frequencies in the F2 will be the Hardy-Weinberg frequencies
corresponding to the gene frequency in the F1. The mean genotypic
value of the F2 is then easily derived by application of equation 7.2.
The gene frequency in the F1, being the mean of the gene frequencies
in the two parent populations, is (p - ½y) for one allele, and (q + ½y)
for the other. Putting these gene frequencies in place of p and q
respectively in equation 7.2 gives the mean genotypic value of the
F2 as:

\[
\begin{aligned}
M_{F_2} &= a(p - \tfrac{1}{2}y - q - \tfrac{1}{2}y) + 2d(p - \tfrac{1}{2}y)(q + \tfrac{1}{2}y) \\
        &= a(p - q - y) + d[2pq + y(p - q) - \tfrac{1}{2}y^2]
\end{aligned}
\qquad (14.9)
\]

The amount of heterosis shown by the F2 is the difference between
the F2 and mid-parent values. So, from equations 14.5 and 14.9,

\[
H_{F_2} = M_{F_2} - M_{\bar{P}} = \tfrac{1}{2}dy^2 = \tfrac{1}{2}H_{F_1} \qquad (14.10)
\]

We find therefore that the heterosis shown by the F2 is only half as
great as that shown by the F1. In other words, the F2 is expected to
drop back half-way from the F1 value toward the mid-parent value.
At first sight this conclusion may seem to contradict the one arrived
at earlier, when we were considering crosses between many lines, the
F1 and F2 means then being equal. The difference between the two
situations is that an F2 made by random mating among a large number
of different crosses has the same inbreeding coefficient as the F1.
But an F2 made from an F1 derived from a single cross has inevitably
an increased inbreeding coefficient. If the inbreeding coefficient is
worked out in the manner described in Example 5.2, it will be found
to be half the inbreeding coefficient of the parent lines. The change
of mean from F1 to F2 may therefore be regarded as inbreeding depression.
It cannot be overcome by having a large number of parents
of the F2 because the restriction of population size that causes the
inbreeding has already been made in the single cross of only two lines,
or parent populations. There need, however, be no further rise of the
inbreeding coefficient in the F3 and subsequent generations. Provided,
therefore, that there is no other reason for the gene frequency
to change, the population mean will be the same in the generations
following as in the F2.
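
The halving of the heterosis in the F2 can be checked in the same way
as the F1 result. In this sketch the F2 is given Hardy-Weinberg
frequencies at the F1 gene frequency, which is the mean of the two
parental gene frequencies. Function names and numerical values are
illustrative assumptions.

```python
def population_mean(p, a, d):
    """Mean of a random-mating population: a(p - q) + 2dpq."""
    q = 1.0 - p
    return a * (p - q) + 2.0 * d * p * q


def heterosis_f1_f2(p1, p2, a, d):
    """Return (F1 heterosis, F2 heterosis) as deviations from the mid-parent."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    mid_parent = 0.5 * (population_mean(p1, a, d) + population_mean(p2, a, d))
    # F1: one gamete from each parent population.
    f1 = a * (p1 * p2 - q1 * q2) + d * (p1 * q2 + q1 * p2)
    # F2: random mating among the F1, so Hardy-Weinberg at the F1 gene frequency.
    f2 = population_mean(0.5 * (p1 + p2), a, d)
    return f1 - mid_parent, f2 - mid_parent


h_f1, h_f2 = heterosis_f1_f2(p1=0.9, p2=0.2, a=1.0, d=0.8)  # hypothetical values
print(f"F1 heterosis = {h_f1:.4f}")
print(f"F2 heterosis = {h_f2:.4f}")
print(f"ratio F2/F1  = {h_f2 / h_f1:.4f}")
```

Because the gene frequency does not change after the F1, applying
`population_mean` again for the F3 would return the same value as for
the F2.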

That the heterosis expected in the F2 is half that found in the F1
is equally true when the joint effects of all loci are considered, provided
that epistatic interaction is absent. The conclusion for a single
locus was based on the principle that Hardy-Weinberg equilibrium
is  attained  by  a  single  generation  of  random  mating.  It  will  be 
remembered  from  Chapter  1  (p.  19),  however,  that  this  is  not  true 
with  respect  to  genotypes  at  more  than  one  locus  considered  jointly. 
Therefore  if  there  is  epistatic  interaction,  the  population  mean  will 
not  reach  its  equilibrium  value  in  the  F2,  but  will  approach  it  more  or 
less  rapidly  according  to  the  number  of  interacting  loci  and  the 
closeness  of  the  linkage  between  them.  The  existence  of  epistatic 
interaction  is  intimately  connected  with  the  scale  of  measurement, 
but  this  matter  will  not  be  discussed  until  Chapter  17.  Here  we  need 
only  note  that  for  reasons  connected  with  the  scale  of  measurement 
the  halving  of  the  heterosis  in  the  F2  expected  on  theoretical  grounds 
is not often found at all exactly in practice, though the F2 usually falls
somewhere between the F1 and mid-parent values. Some examples
from plants of the heterosis observed in the F1 and F2 generations are
illustrated in Fig. 14.2. It will be noticed that with some of the
characters shown, the F1 and F2 are lower in value than the mid-parent,
and the heterosis is consequently negative in sign. This is in
no way inconsistent with our definition of heterosis as the difference
between the F1 or F2 and the mid-parent value. The sign of the
difference  depends  simply  on  the  nature  of  the  measurement.  For 
example,  the  character  "days  to  first  fruit,"  represented  in  the  lower 
graphs,  shows  heterosis  of  negative  sign:  but  if  the  character  were 
called  "speed  of  development"  and  expressed  as  a  reciprocal  of  time 
the  heterosis  would  be  positive  in  sign. 

Fig. 14.2. Some illustrations of heterosis observed in crosses
between pairs of highly inbred strains of plants. The points show
the mean values of the two parent strains, the F1 and the F2
generations. The mid-parent values are shown by horizontal lines.
Graph (a) refers to tobacco, Nicotiana rustica (data from Smith,
1952). All the other graphs refer to tomatoes, Lycopersicon (data
from Powers, 1952). The characters represented are:

(a) Height of plant (in.)
(b) Mean weight of one fruit (gm.)
(c) Number of locules per fruit
(d) Mean weight per locule (gm.)
(e)-(h) Mean time in days between the planting of the seed and
the ripening of the first fruit, in 4 different crosses.

The relative amount of heterosis observed in the F1 and F2
generations is complicated also by the existence of maternal effects,
particularly in mammals. A character subject to a maternal effect,
such as litter size, is divided between two generations. The maternally
determined component of the character may be expected to follow the
same general pattern of heterosis in the F1 and F2 as we have just
discussed, but it will be one generation out of phase with the
non-maternal part of the character. Thus the heterosis observed in the F1
is attributable to the non-maternal part, the maternal effect being still
at the inbred level. In the F2, however, the non-maternal part will
lose half the heterosis as explained above, but the maternal effect will
now show the full effect of its heterosis since the mothers are now in
the F1 stage. This rather complicated situation may perhaps be more
readily grasped from the diagrammatic representation in Fig. 14.3.
As a result of maternal effects, therefore, the loss of heterosis in the
F2 and subsequent generations is usually less noticeable with animals
than with plants, and experiments of great precision would be required
to detect any regular pattern.

Fig. 14.3. Diagram of the heterosis expected in a character subject
to a maternal effect, when two lines are crossed and the F2 is
made by random mating among the F1. The maternal and non-maternal
components of the character separately are here supposed
to show equal amounts of heterosis, and to combine by simple
addition to give the character as it is measured.
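
The one-generation lag of the maternal component can be set out as a
small calculation. The sketch assumes, as in the diagram, that the two
components show equal heterosis and add together; the F3 row is not
given in the text and is simply the same rule carried one generation
further. The function name and the unit of heterosis are illustrative.

```python
def heterosis_by_generation(h_direct, h_maternal):
    """Expected heterosis (deviation from mid-parent) in P, F1, F2 and F3.

    The non-maternal part shows the full heterosis in the F1 and half of
    it from the F2 onward. The maternal part follows the same pattern one
    generation later, because it depends on the generation of the mothers.
    """
    pattern = {"P": 0.0, "F1": 1.0, "F2": 0.5, "F3": 0.5}
    mother_of = {"F1": "P", "F2": "F1", "F3": "F2"}
    rows = {}
    for generation in ["P", "F1", "F2", "F3"]:
        direct = h_direct * pattern[generation]
        # Mothers of the parental lines are themselves inbred (pattern "P").
        maternal = h_maternal * pattern[mother_of.get(generation, "P")]
        rows[generation] = (direct, maternal, direct + maternal)
    return rows


for generation, (direct, maternal, total) in heterosis_by_generation(1.0, 1.0).items():
    print(f"{generation:>2}: non-maternal {direct:.1f}  maternal {maternal:.1f}  total {total:.1f}")
```

Under these assumptions the measured character is higher in the F2 than
in the F1, and only falls back in the following generation.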

Wide  crosses.  We  have  seen  that  the  amount  of  heterosis  shown 
by  a  particular  cross  depends,  among  other  things,  on  the  differences 
of  gene  frequency  between  the  two  populations  crossed.  This  would 
seem to indicate that the amount of heterosis would increase with the
degree of genetic differentiation between the two populations and
would be limited only by the barrier of interspecific sterility. This,
however, is not true. Crosses between subspecies, or between local
races, taken from the wild often fail to show heterosis, particularly
in characters closely related to fitness which show heterosis in crosses
between less differentiated laboratory populations. Indeed the F1's
of wide crosses are often less fit than the parent populations. Much
of the evidence about such crosses comes from studies of wild
populations of Drosophila pseudoobscura and other species (see
Dobzhansky, 1950; Wallace and Vetukhiv, 1955). Though wide
crosses may not show heterosis in fitness, they do often show heterosis
in certain characters, particularly growth rate in plants. Dobzhansky
(1950, 1952), who drew attention to this, refers to heterosis
in  fitness  as  "euheterosis"  and  to  heterosis  in  a  character  that  does  not 
confer  greater  fitness  as  "luxuriance." 

The  error  in  extending  our  earlier  conclusion  to  wide  crosses 
arises from the fact that we have assumed epistatic interaction between
loci to be negligible, an assumption that is probably justified
for crosses between breeds of domestic animals or between laboratory
populations, but is obviously not justified in the case of crosses between
differentiated wild populations. The existing genetic differentiation
between wild populations has, for the most part, arisen by
evolutionary adaptation to the local conditions. Adaptation to local
conditions or to a particular way of life involves many different
characters, both structural and functional, because the fitness of the
organism depends on the harmonious interrelations of all its parts.
If two populations adapted to different ways of life are crossed, the
cross-bred individuals will be adapted to neither, and will consequently
be less fit than either of the parent populations. The effect
of this evolutionary adaptation on the genetic structure of the populations
is as follows. The genes A1 and B1, say, are selected in one
population because together they increase fitness, though either one
separately may not; while, in another population living under different
conditions, the genes A2 and B2 are selected for similar reasons.
In respect of fitness, therefore, there is epistatic interaction between
these two loci. But if these pairs of genes become fixed throughout the
two populations, A1 and B1 in one and A2 and B2 in the other, and so
become part of their constant genetic structure, the variation arising
from this interaction will disappear. Within any one population,
therefore, we may find very little epistatic variation, and the interaction
will become apparent as a cause of variation between individuals
only in a cross-bred population in which there is segregation at both
interacting  loci. 

The  idea  that  the  genetic  structure  of  a  natural  population  evolves 
as a whole, so that the selection pressure on any one locus is dependent
on the alleles present at many of the other loci, is expressed in the
terms "coadaptation" and "integration," used to describe the genetic
structure of natural populations. (For general discussions of these
concepts, see Dobzhansky, 1951b; Lerner, 1954, 1958; Wright,
1956.) The important point for us to note is this. The property of
coadaptation, or integration, assumes primary importance only when
different populations are to be compared and when the results of
crossing adaptively differentiated populations are to be studied; it is
of less importance in the genetic study of a single population. In
this book we are chiefly concerned with the genetic variation within a
population: that is, the variation arising from the segregation of genes
in the population. Some of this variation arises from epistatic interaction
between the genes segregating at different loci, which is the
raw material, as it were, from which coadaptation could evolve if the
population were to become subdivided. But the amount of this epistatic
variation within a population is probably seldom very large,
and  moreover  it  is  seldom  necessary  to  distinguish  it  from  other 
sources  of  non-additive  genetic  variance. 




CHAPTER  15 

INBREEDING  AND  CROSSBREEDING: 

II.  Changes  of  Variance 

The  effect  of  inbreeding  on  the  genetic  variance  of  a  metric  character 
is  apparent,  in  its  general  nature,  from  the  description  of  the  changes 
of  gene  frequency  given  in  Chapter  3.  Again,  we  have  to  imagine  the 
whole  population,  consisting  of  many  lines.  Under  the  dispersive 
effect  of  inbreeding,  or  random  drift,  the  gene  frequencies  in  the 
separate lines tend toward the extreme values of 0 or 1, and the lines
become differentiated in gene frequency. Since the mean genotypic
value of a metric character depends on the gene frequencies at the
loci affecting it, the lines become differentiated, or drift apart, in
mean genotypic value. And, since the genetic components of variance
diminish as the gene frequencies tend toward extreme values
(see Fig. 8.1), the genetic variance within the lines decreases. The
general consequence of inbreeding, therefore, is a redistribution of the
genetic variance; the component appearing between the means of
lines increases, while the component appearing within the lines
decreases. In other words, inbreeding leads to genetic differentiation
between lines and genetic uniformity within lines. The differentiation
is illustrated from experimental data in Fig. 15.1.

The  subdivision  of  an  inbred  population  into  lines  introduces  an 
additional  observational  component  of  variance,  the  between-line 
component,  and  it  is  not  surprising  that  this  adds  a  considerable 
complication  to  the  theoretical  description  of  the  components  of 
genetic variance. Indeed, a full theoretical treatment of the redistribution
of variance has not yet been achieved. Here we shall attempt
no  more  than  a  brief  description  of  the  main  outlines,  and  for  this  we 
shall  have  to  make  some  simplifications.  In  particular  we  shall 
entirely  neglect  the  interaction  component  of  genetic  variance  arising 
from  epistasis.  For  detailed  treatment  of  various  aspects  of  the 
problem,  and  for  references,  see  Kempthorne  (1957,  Ch.  17).  After 
this  description  of  the  redistribution  of  genetic  variance  we  shall 
consider changes of environmental variance. The greater sensitivity
of inbred individuals to environmental sources of variation was
mentioned earlier, in Chapter 8. This phenomenon interferes with
the experimental study of the changes of variance, and until it is
better understood we cannot put much reliance on the theoretical
expectations concerning variance being manifest in the observable
phenotypic variance. Finally, in this chapter, we shall discuss the use
of inbred animals for experimental purposes.

Fig. 15.1. Differentiation between lines by random drift, shown
by abdominal bristle number in Drosophila melanogaster. The
graphs show the mean bristle number in each of 10 lines during
full-sib inbreeding without artificial selection. (From Rasmuson,
1952; reproduced by courtesy of the author and the editor of Acta
Zoologica.)


Redistribution  of  Genetic  Variance 

The redistribution of variance arising from additive genes (i.e.
genes with no dominance) is easily deduced. This is because with
additive genes the proportions in which the original variance is
distributed within and between lines do not depend on the original
gene frequencies. When there is dominance, however, we cannot
deduce the changes of variance without a knowledge of the initial
gene frequencies. This not only adds considerably to the mathematical
complexity, but it renders a general solution impossible. We shall
first consider the case of additive genes, and then very briefly indicate
the conclusions arrived at for dominant genes. The effect of selection
will not be specifically discussed. We need only note that natural
selection will tend to render the actual state of dispersion of gene
frequencies less than that indicated by the inbreeding coefficient
computed from the population size or pedigree relationships. Therefore
we must expect the redistribution of genetic variance to proceed
at  a  slower  rate  than  the  theoretical  expectation,  and  we  must  expect 
the  discrepancy  to  be  greater  when  inbreeding  is  slow  than  when  it  is 
rapid. 

No  dominance.  What  follows  refers  to  the  variance  arising  from 
additive  genes:  it  does  not  apply  to  the  additive  variance  arising  from 
genes  with  dominance.  The  conclusions  therefore  apply,  strictly 
speaking,  only  to  characters  which  show  no  non-additive  variance. 
They  serve,  however,  to  indicate  the  general  effect  of  inbreeding  on 
variance,  and  may  be  taken  as  a  fair  approximation  to  what  is  expected 
of  characters  such  as  bristle  number  in  Drosophila,  that  show  little 
non-additive  genetic  variance.  The  description  to  be  given  refers  to 
slow inbreeding, and is not strictly true of rapid inbreeding by sib-mating
or self-fertilisation. The redistribution of the variance under
rapid  inbreeding  is,  however,  not  very  different  except  in  the  first  few 
generations. 

Consider first a single locus. When there is no dominance the
genotypic variance in the base population, given in equation 8.7, becomes

\[ V_G = 2p_0q_0a^2 \]

The variance within any one line is

\[ V_G = 2pqa^2 \]

where p and q are the gene frequencies in that line. The mean variance
within lines is

\[ V_{Gw} = 2(\overline{pq})a^2 \]

where $(\overline{pq})$ is the mean value of pq over all lines. Now, $2(\overline{pq})$ is the
overall frequency of heterozygotes in the whole population, which, by
Table 3.1, is equal to $2p_0q_0(1 - F)$, where F is the coefficient of
inbreeding. Therefore

\[
\begin{aligned}
V_{Gw} &= 2p_0q_0a^2(1 - F) \\
       &= V_G(1 - F)
\end{aligned}
\]

and this remains true when summation of the variances is made over
all loci. Thus the within-line variance is (1 - F) times the original
variance, and as F approaches unity the within-line variance approaches
zero.

Now let us consider the between-line variance. This is the variance
of the true means of lines, and would be estimated from an
analysis of variance as the between-line component. For a single
locus, still with no dominance, the mean genotypic value of a line
with gene frequencies p and q is obtained from equation 7.2 as

\[
\begin{aligned}
M &= a(p - q) \\
  &= a(1 - 2q)
\end{aligned}
\]

Thus we want to find the variance of (a - 2aq). Now, in general,
$\sigma^2_{(X-Y)} = \sigma^2_X + \sigma^2_Y$, if X and Y are uncorrelated. Since in this case a is
constant from line to line (epistasis being assumed absent) it has no
variance, and so

\[ \sigma^2_M = \sigma^2_{(2aq)} \]

Again, in general, $\sigma^2_{KX} = K^2\sigma^2_X$ when K is a constant. So

\[
\begin{aligned}
\sigma^2_M &= 4a^2\sigma^2_q \\
           &= 4a^2p_0q_0F \qquad \text{(from 3.14)} \\
           &= 2FV_G
\end{aligned}
\]

and this also remains true when summation is made over all loci.
Thus the between-line genetic variance is 2F times the genetic variance
in the base population.
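
Both components can be checked against an exact calculation for a
single additive locus. The sketch below follows the distribution of
gene frequency among lines under binomial sampling of genes in each
generation, and assumes the expression F = 1 - (1 - 1/2N)^t for an
idealised population of N individuals, which comes from the earlier
treatment of small populations and is not derived in this section. The
function names and numerical values are illustrative.

```python
from math import comb


def drift_distribution(n_genes, start_count, generations):
    """Exact distribution of the number of copies of one allele among the
    n_genes (= 2N) genes of a line, after repeated binomial sampling."""
    dist = [0.0] * (n_genes + 1)
    dist[start_count] = 1.0
    for _ in range(generations):
        new = [0.0] * (n_genes + 1)
        for i, prob in enumerate(dist):
            if prob == 0.0:
                continue
            q = i / n_genes
            for j in range(n_genes + 1):
                new[j] += prob * comb(n_genes, j) * q ** j * (1.0 - q) ** (n_genes - j)
        dist = new
    return dist


def variance_components(n_genes, start_count, generations, a):
    """Mean within-line variance and between-line variance for one additive locus."""
    dist = drift_distribution(n_genes, start_count, generations)
    q_values = [j / n_genes for j in range(n_genes + 1)]
    mean_q = sum(w * q for w, q in zip(dist, q_values))
    var_q = sum(w * (q - mean_q) ** 2 for w, q in zip(dist, q_values))
    mean_pq = sum(w * q * (1.0 - q) for w, q in zip(dist, q_values))
    within = 2.0 * mean_pq * a * a      # mean of 2pq a^2 over lines
    between = 4.0 * a * a * var_q       # variance of the line means a(1 - 2q)
    return within, between


def expected_f(n_genes, generations):
    """Inbreeding coefficient of an idealised population (assumed formula)."""
    return 1.0 - (1.0 - 1.0 / n_genes) ** generations


n_genes, start_count, generations, a = 10, 3, 6, 1.0   # hypothetical: N = 5, q0 = 0.3
q0 = start_count / n_genes
vg = 2.0 * q0 * (1.0 - q0) * a * a
f = expected_f(n_genes, generations)
within, between = variance_components(n_genes, start_count, generations, a)

print(f"F = {f:.4f}")
print(f"within-line:  exact {within:.6f}   (1 - F) V_G = {(1.0 - f) * vg:.6f}")
print(f"between-line: exact {between:.6f}   2 F V_G     = {2.0 * f * vg:.6f}")
```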

The partitioning of the genetic variance into components as
explained above is summarised in Table 15.1.

Table 15.1
Partitioning of the variance due to additive genes in a
population with inbreeding coefficient F, when the variance
due to additive genes in the base population is V_G.

Between lines      2FV_G
Within lines       (1 - F)V_G
Total              (1 + F)V_G

The total genetic
variance in the whole population is the sum of the within-line and
between-line components, and is equal to (1 + F) times the original
genetic variance. (This is true also of close inbreeding.) Thus when
inbreeding is complete the genetic variance in the population as a
whole is doubled, and all of it appears as the between-line component.
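
The partition for additive genes is easily tabulated. The following
sketch takes the genetic variance of the base population as one unit
and prints the three components at several values of F; the function
name is illustrative.

```python
def partition_additive_variance(vg, f):
    """Between-line, within-line and total genetic variance for additive
    genes, given base-population variance vg and inbreeding coefficient f."""
    between = 2.0 * f * vg
    within = (1.0 - f) * vg
    return between, within, between + within


print("   F   between  within   total")
for f in (0.0, 0.25, 0.5, 0.75, 1.0):
    between, within, total = partition_additive_variance(1.0, f)
    print(f"{f:5.2f}  {between:6.2f}  {within:6.2f}  {total:6.2f}")
```

At F = 1 the within-line component has vanished and the total is twice
the original variance.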
The  genetic  variance  within  lines,  before  inbreeding  is  complete, 
is  partitioned  within  and  between  the  families  of  which  the  lines  are 
composed.  Under  slow  inbreeding  with  random  mating  within  the 
lines,  it  is  partitioned  equally  within  and  between  full-sib  families. 
The  covariance  of  relatives  within  the  lines  is  just  as  described  in 
Chapter  9,  each  line  being  a  separate  random-breeding  population 
with a total genetic variance of $(1 - F)V_G$, on the average. From this
we can deduce what the heritability is expected to be within any one
line. It will be $(1 - F)V_G / [(1 - F)V_G + V_E]$, and this reduces to

\[ h_t^2 = \frac{h_0^2(1 - F_t)}{1 - h_0^2F_t} \qquad (15.1) \]

where $h_t^2$ and $F_t$ are the heritability within lines and the inbreeding
coefficient at time t, and $h_0^2$ is the original heritability in the base
population. This shows how the heritability is expected to decline
with  the  inbreeding  in  a  small  population.  The  formula,  however,  is 
applicable  only  to  characters  with  no  non-additive  variance,  and  in 
the  absence  of  selection.  The  operation  of  natural  selection  renders 
the  reduction  of  the  heritability  less  than  expected,  especially  under 
slow  inbreeding.  This  point  has  been  demonstrated  experimentally 
with  Drosophila  (Tantawy  and  Reeve,  1956). 
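
The partition in Table 15.1 and the heritability formula can be tabulated numerically. The following is only a sketch under the assumptions stated in the text (additive genes, no selection, constant environmental variance); the function names and the illustrative values of VG and of the original heritability are not from the text.

```python
def partition_additive_variance(v_g, f):
    """Split additive genetic variance as in Table 15.1.

    v_g: genetic variance in the base population
    f:   inbreeding coefficient
    Returns (between_lines, within_lines, total).
    """
    between = 2.0 * f * v_g
    within = (1.0 - f) * v_g
    return between, within, between + within


def heritability_within_lines(h2_0, f_t):
    """Expected heritability within a line at inbreeding coefficient f_t,
    given the original heritability h2_0 in the base population."""
    return h2_0 * (1.0 - f_t) / (1.0 - h2_0 * f_t)


if __name__ == "__main__":
    # Illustrative values only: VG = 1 and an original heritability of 0.5.
    for f in (0.0, 0.25, 0.5, 0.75, 1.0):
        between, within, total = partition_additive_variance(1.0, f)
        h2 = heritability_within_lines(0.5, f)
        print(f"F={f:.2f}  between={between:.2f}  within={within:.2f}  "
              f"total={total:.2f}  h2 within lines={h2:.3f}")
```

With these inputs the total rises from VG to 2VG as F goes from 0 to 1, while the heritability within lines falls to zero.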

Fig. 15.2. Redistribution of variance arising from a single fully
recessive gene with initial frequency q0 = 0.1. (a) with full-sib
mating, (b) with slow inbreeding. Axis labels: generations of
inbreeding; inbreeding coefficient. (From A. Robertson, 1952;
reproduced by courtesy of the author and the editor of Genetics.)
Vt = total genetic variance.
Vb = between-line component.
Vw = within-line component.
Va = additive genetic variance within lines.

Dominance. The components of variance arising from additive
genes will have been seen to be independent of the gene frequencies
in the base population. When we consider genes with any degree of
dominance, however, we find that the changes of variance on
inbreeding depend on the initial gene frequencies, and this makes it
impossible to give a general solution in terms of the genetic variance
present in the base population. We shall therefore do no more than
give the conclusions arrived at by A. Robertson (1952) for the case of
fully dominant genes, when the recessive allele is at low frequency.
This is the situation most likely to apply to variation in fitness arising
from deleterious recessive genes, though the effects of selection are
here disregarded. Fig. 15.2 shows the redistribution of variance
arising from recessive genes at a frequency of q = 0.1 in the base
population. Fig. 15.2(a) refers to full-sib mating with only one
family in each line, and Fig. 15.2(b) refers to slow inbreeding. A
surprising feature of the conclusions is that the within-line variance
at first increases, reaching a maximum when the coefficient of
inbreeding is a little under 0.5, and it remains at a fairly high level until
the coefficient of inbreeding approaches 1. The reason, in general
terms, for the apparent anomaly that the variation within lines
increases during the first stages of inbreeding can be seen from a
consideration of the relationship between the gene frequency and the
variance arising from a dominant gene shown in Fig. 8.1(b). The
gene frequency is taken to start at a value of 0.1, and on inbreeding it
will increase in some lines and decrease in others, the increase being
on the average equal in amount to the decrease. But examination of
the graph shows that an increase of gene frequency by a certain
amount will increase the variance more than a decrease of the same
amount will reduce it. Therefore, on the average, the variance within
the lines will increase in the early stages of inbreeding. This increase
of variance would be detectable in practice only if a substantial part
of the genetic variance were due to recessive genes at low frequencies.
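
The asymmetry described above can be checked with a little arithmetic. The sketch below assumes a single fully recessive gene in a group mating at random, so that recessive homozygotes (frequency q squared) have genotypic value -a and all other genotypes +a; the variance is then 4a²q²(1 - q²). This expression and the step of 0.05 in gene frequency are assumptions made for illustration, not figures taken from Robertson's analysis.

```python
def recessive_gene_variance(q, a=1.0):
    """Genotypic variance due to one fully recessive gene at frequency q.

    Assumes random mating, with recessive homozygotes (frequency q**2)
    at value -a and the other two genotypes at value +a, so the variance
    of this two-class distribution is (2a)**2 * q**2 * (1 - q**2).
    """
    return 4.0 * a ** 2 * q ** 2 * (1.0 - q ** 2)


q0, step = 0.1, 0.05            # illustrative starting frequency and change
base = recessive_gene_variance(q0)
gain = recessive_gene_variance(q0 + step) - base   # line where q rose
loss = base - recessive_gene_variance(q0 - step)   # line where q fell

print(f"variance at q0:        {base:.4f}")
print(f"gain when q increases: {gain:.4f}")
print(f"loss when q decreases: {loss:.4f}")
```

Because the gain exceeds the loss, the average variance of two lines that have drifted equally in opposite directions is larger than the variance at the starting frequency.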
Practical  considerations.  The  extent  to  which  the  theoretical 
changes  of  variance  described  in  this  chapter  can  be  observed  in 
practice  depends  on  how  much  environmental  variance  is  present. 
The precise estimation of variance requires a large number of
observations and the estimates obtained in practice are usually subject to
rather large deviations due to the chances of sampling. Consequently
the  changes  of  variance  must  usually  be  quite  substantial  before  they 
are  likely  to  be  readily  detected.  The  genotypic  variance,  moreover, 
seldom  constitutes  the  major  part  of  the  phenotypic  variance. 
Therefore,  in  relation  to  the  original  phenotypic  variance,  the  expected 
changes  due  to  inbreeding  are  usually  rather  small,  and  this  renders 
their  detection  all  the  more  difficult.  Furthermore,  the  detection  of 
the  expected  changes  of  phenotypic  variance  is  entirely  dependent  on 
the  constancy  of  the  environmental  variance,  and  this  cannot  be 
assumed  without  evidence,  as  we  shall  show  in  the  next  section.  For 
these  reasons,  and  also  because  of  the  simplifications  we  have  had  to 
make,  we  must  bear  in  mind  the  uncertainties  in  the  connexion 
between what is expected and what may be observed in the
phenotypic variance.
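
To give a rough sense of the scale involved, the expected within-line phenotypic variance can be expressed as a fraction of the original phenotypic variance. This is a worked sketch under the additive model above, assuming in addition that the environmental variance stays constant (which, as the text warns, cannot be taken for granted). The values of 0.3 for the original heritability and 0.25 for F are illustrative only.

```latex
\frac{(1 - F)V_G + V_E}{V_G + V_E} = 1 - F h^2_0 ,
\qquad \text{e.g. } 1 - 0.25 \times 0.3 = 0.925 .
```

In this case the phenotypic variance within lines would be expected to fall by only about 7.5 per cent, a change that would need a large body of data to detect.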


Changes  of  Environmental  Variance 

Several  times  in  previous  chapters  we  have  referred  to  the  fact  that 
the  environmental  component  of  variance  may  differ  according  to 
the  genotype;  in  particular  that  inbred  individuals  often  show  more 
environmental  variation  than  non-inbred  individuals.  This  fact  has 
been  revealed  by  many  experiments  in  which  the  variances  of  inbreds 
and  of  hybrids  have  been  compared.  Any  difference  of  phenotypic 
variance between highly inbred lines and the F1 between them (i.e.
the  "hybrid")  must  be  attributed  to  a  difference  of  the  environmental 
component,  because  the  genetic  variance  is  negligible  in  amount  in 
the  hybrids  as  well  as  in  the  inbred  lines.  The  greater  susceptibility 
of  inbreds  than  of  hybrids  to  environmental  sources  of  variation  has 
been  observed  in  a  wide  variety  of  characters  and  organisms.  Some 
examples  are  cited  in  Table  15.2;  others  will  be  found  in  the  review 
by Lerner (1954).

Table 15.2

Comparisons of Phenotypic Variance in Inbreds and Hybrids

The figures are the averages of the inbred lines, and of the
F1's where more than one cross was made. (C.V.)² = squared
coefficient of variation.

                                                  Inbreds   Hybrids

Drosophila melanogaster: wing length
(Robertson and Reeve, 1952b). (C.V.)².
6 inbreds and 6 F1's                                         1.24

Mice: duration of "Nembutal" anaesthesia
(McLaren and Michie, 1956b). Log minutes.
2 inbreds and 1 F1                                0.0665    0.0165

Mice: age at opening of vagina
(Yoon, 1955). Days.
3 inbreds and 2 F1's

Mice: weight at ages given
(Chai, 1957). (C.V.)².
2 inbreds and 1 F1

Rats: weight at 90 days
(Livesay, 1930). (C.V.)².
3 inbreds and 2 F1's

The cause of the greater environmental variance of inbreds is not
yet fully understood. It has been suggested that the possession of
different alleles at specific loci endows the hybrids with greater
"biochemical versatility" (Robertson and Reeve, 1952b), which
enables them to adjust their development and physiological
mechanisms to the circumstances of the environment: in other words that
developmental and physiological homeostasis is improved by allelic
diversity. On the other hand, it has been suggested (Mather, 1953a)
that the reduced homeostatic power of inbreds is to be regarded as a
manifestation of inbreeding depression: homeostatic power is likely
to be an important aspect of fitness, and would therefore be expected,
like other aspects of fitness, to decline on inbreeding. The
underlying mechanism, we may presume, would be directional dominance,
genes that increase homeostatic power tending on the average to be
dominant over their alleles that decrease it. Lerner (1954) sees a
causal connexion between variability and fitness. He believes greater
stability to be a general property of heterozygotes and regards it as
the cause of their greater fitness. Though the increase of
environmental variance on inbreeding is a phenomenon of great theoretical
interest and some practical importance, too little is known about it to
justify a more detailed discussion of its causes here. Comprehensive
discussions will be found in Lerner (1954) and Waddington (1957).

There are, however, two further points in connexion with the
phenomenon that should be mentioned. The first is a technical
matter. If the mean value of the character differs between inbreds
and hybrids, as it frequently does, then it may be difficult to decide
on  a  proper  basis  for  the  comparison  of  the  variances.  It  is  necessary 
to  find  a  measure  of  the  variance  that  does  not  merely  reflect  the 
difference  of  mean  value,  and  for  this  purpose  the  coefficient  of 
variation  is  often  an  appropriate  measure.  The  problem  is  basically  a 
matter  of  the  choice  of  scale,  and  will  be  discussed  again  in  Chapter 
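
The point about choosing a measure that does not merely reflect the mean can be illustrated with the squared coefficient of variation used in Table 15.2. The sketch below expresses the coefficient of variation as a percentage, which is an assumption about the convention; the two sets of measurements are invented for illustration and are not data from the studies cited.

```python
from statistics import mean, variance


def squared_cv(values):
    """Squared coefficient of variation, with the C.V. taken as a percentage:
    (100 * s / mean)**2, computed as 10^4 * s^2 / mean^2."""
    return 1e4 * variance(values) / mean(values) ** 2


# Hypothetical measurements: group_b is group_a measured on a doubled scale.
group_a = [8, 9, 10, 11, 12]
group_b = [16, 18, 20, 22, 24]

print("variances:", variance(group_a), variance(group_b))
print("(C.V.)^2 :", squared_cv(group_a), squared_cv(group_b))
```

The raw variances differ fourfold purely because the means differ twofold, whereas the squared coefficients of variation are equal.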

The  second  point  concerns  the  nature  of  the  environmental 
variation  that  is  being  measured.  There  is  a  distinction  to  be  made 
between the "developmental" variation arising from "accidents of
development" on the one hand, and adaptive responses to changed
conditions on the other. The developmental variation is a
manifestation of incomplete buffering, or canalisation, of development and
is  generally  regarded  as  being  harmful.  Inbreds,  in  so  far  as  they 
show  a  greater  amount  of  developmental  variation,  are  therefore  less 
fit  than  hybrids;  they  are  less  well  able  to  adjust  their  development  to 
different  conditions  of  the  environment  so  as  to  achieve  the  optimal 
phenotype.  An  adaptive  response,  in  contrast,  is  a  modification  of  the 
phenotypic  value  that  is  beneficial  to  the  individual,  such  as  for 
example  the  thickening  of  the  coat  of  mammals  in  response  to  low 
temperature.  If  the  greater  fitness  of  hybrids  over  inbreds  extends  to 
adaptive  responses  we  should  therefore  expect  hybrids  to  show  more 
variation of this sort than inbreds. Thus the nature of the
environmental variation has an important bearing on the interpretation of a
difference  of  variability  between  inbreds  and  hybrids. 


Uniformity  of  Experimental  Animals 

Inbred strains of laboratory animals, particularly of mice, are
widely used as experimental material in pharmacological,
physiological, and nutritional laboratories, when uniformity of biological
material is desired. In some kinds of work, work for example which
demands the absence of immunological reactions, it is genetic
uniformity that is required, and abundant experience has shown that the
inbred strains of mice fully satisfy this requirement. In spite of
doubts about how effective natural selection for heterozygotes may be
in delaying the progress towards homozygosity, these strains have
been proved in practice to be genetically uniform. In the course of
their maintenance, however, strains inevitably become split up into
sublines, and it is only within a subline that their genetic uniformity
can be relied on. Recent work, described in the two following
examples, has revealed genetic differentiation within two widely used
strains of mice, and has shown that differences can sometimes be
detected between sublines separated by only a few generations.

Fig. 15.3. Differentiation between sublines of the C3H inbred
strain of mice, in the number of lumbar vertebrae. Each circle
represents a sample of individuals classified for the number of
lumbar vertebrae. The proportions of black and white in the
circles show the proportions of individuals with 6 and with 5
lumbar vertebrae respectively. (Small proportions of asymmetrical
individuals are included with the 5-vertebra classes.) The circles
are positioned according to the date of classification, and arranged
according to their pedigree relationships. (Data from McLaren
and Michie, 1954.) Key: white = 5 vertebrae (+ asymmetrical);
black = 6 vertebrae.

Example 15.1. The inbred strain of mice known as C3H exhibits
variability in the number of lumbar vertebrae, and the sublines differ
markedly in this character. Some sublines consist entirely of mice with
5 vertebrae, others entirely of mice with 6, and others with different
proportions. The strain originated in 1920 and was split into three main
groups of sublines in about 1930, each group being later subdivided
further. The number of lumbar vertebrae has been studied in 16 sublines
maintained in America and Britain (McLaren and Michie, 1954). The
pedigree relationships between these sublines, and the proportions of the
two vertebral types in them, are shown in Fig. 15.3. One of the three main
groups of sublines has predominantly 6 lumbar vertebrae, and the other
two groups predominantly 5. This differentiation between the main
groups may have been due to residual segregation in the strain at the time
when the main groups became separated. The strain had, however, been
full-sib mated for 10 years (probably between 20 and 30 generations)
before the separation of the groups, and residual segregation therefore
seems unlikely. The sublines within the main groups are differentiated in
a manner that points to mutation rather than residual segregation as the
cause. The mutational origin of differentiation is more clearly proved in
the study described in the next example.

Example  15.2.  Another  inbred  strain  of  mice,  known  as  C57BL,  has 
been the subject of a thorough study by Grüneberg and co-workers (Deol,
Grüneberg, Searle, and Truslove, 1957; Carpenter, Grüneberg, and
Russell, 1957). Twenty-seven skeletal characters were examined in four main
groups  of  sublines,  three  maintained  in  America  and  one  in  Britain,  the 
British  group  being  studied  in  greater  detail.  The  nature  and  extent  of  the 
differentiation  found  cannot  be  easily  summarised,  and  therefore  we  shall 
only  state  the  conclusions  reached  about  the  cause  of  the  differentiation. 
Each  of  the  four  main  groups  differed  from  the  others  in  between  7  and  17 
out  of  the  27  characters.  The  following  conclusions  were  drawn:  (1)  The 
differentiation  could  not  reasonably  be  attributed  to  residual  segregation 
before the separation of the sublines; and segregation following an
accidental outcross was conclusively disproved. (2) Sublines that had been
separated for a longer time tended to differ by a greater number of
characters than sublines more recently separated. But the magnitude of the
difference  in  any  one  character  was  no  greater  between  long-separated 
sublines  than  between  sublines  only  recently  separated.  From  this  it  was 
concluded  that  the  differences  in  each  character  were  caused  by  mutations 
at  single  loci.  The  average  difference  caused  by  one  mutational  step 
amounted to about 0.6 standard deviation of the character affected.

The study cited in the above example shows that the differences
between sublines, though they may be readily detectable, are
probably caused by rather few loci. The differentiation is quite small in
comparison with the differences between strains or between
individuals in a non-inbred population.

In much of the work for which inbred strains are used it is not
the genetic uniformity alone that matters, but the phenotypic
uniformity. The more variable the animals the larger the number that
must be used to attain a given degree of precision in measuring their
mean response to a treatment. The value of uniformity is therefore
in reducing the number of animals that must be used in an
experiment or a test. Inbred animals, however, are costly to produce
because of their poor breeding qualities, and the advantage gained
from genetic uniformity has to be weighed against the extra cost of
the material. If the character to be measured is one of which the
phenotypic variance is chiefly environmental in origin, then the
absence of genetic variation in an inbred strain will reduce the
phenotypic variance by only a small amount. The extra cost of the inbred
animals may then outweigh the advantage of their being slightly
more uniform than non-inbred animals. The phenotypic uniformity
of inbred animals, however, has been taken on trust from the genetical
theory of inbreeding, and it seems now that this trust has, to some
extent at least, been misplaced. In some characters inbred animals
are more phenotypically variable than non-inbred (see Table 15.2)
on account of their greatly increased environmental variation. It
seems now that for some, perhaps for many, characters the greatest
phenotypic uniformity is found in hybrids (i.e. F1's) produced by
crossing two inbred strains. The value of hybrids for work requiring
phenotypic uniformity has been discussed by Grüneberg (1954), and
by Biggers and Claringbold (1954).
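
The link between variability and the number of animals required follows from the standard error of a mean, which is the square root of the variance divided by the number of observations. The sketch below uses that relation only; the function name, the three phenotypic variances, and the target standard error are hypothetical values chosen for illustration.

```python
import math


def animals_needed(phenotypic_variance, target_standard_error):
    """Smallest number of animals n for which the standard error of the
    mean, sqrt(variance / n), does not exceed the target."""
    return math.ceil(phenotypic_variance / target_standard_error ** 2)


# Hypothetical phenotypic variances for three kinds of stock.
stocks = {"non-inbred": 1.0, "inbred": 0.9, "F1 hybrid": 0.5}
for name, v_p in stocks.items():
    print(f"{name:>10}: {animals_needed(v_p, 0.25)} animals")
```

Since the number required is proportional to the phenotypic variance, a stock with half the variance needs half as many animals for the same precision.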

One final point about the use of inbred and hybrid animals may
be noted. An inbred strain or the F1 of two inbred strains has a
unique genotype; and that of an inbred, moreover, is one that cannot
occur in a natural population. Testing the response to any treatment
on one inbred strain or one hybrid is therefore testing it on one
genotype. If there are appreciable differences of response between
different genotypes, the experimenter is then not justified in
describing his results as referring, for example, to "the mouse."


CHAPTER    16 

INBREEDING  AND   CROSSBREEDING: 

III.  The  Utilisation  of  Heterosis 

The  crossing  of  inbred  lines  plays  a  major  role  in  the  present  methods 
of  plant  improvement,  though  in  animal  improvement  it  plays  a  much 
less  important  part.  In  this  chapter  the  genetic  principles  underlying 
the  use  of  inbreeding  and  crossing  will  be  explained,  and  the  various 
methods  described  in  outline.  Technical  details,  however,  will  not  be 
given:  for  these  the  reader  should  consult  a  textbook  of  plant  breeding 
(e.g.  Hayes,  Immer,  and  Smith,  1955).  We  shall  be  concerned  with 
outbreeding plants and with animals. But since the methods
applicable to naturally self-fertilising plants are at first sight
rather like those applicable to outbreeding plants and animals,
it  will  be  advisable  first  to  consider  very  briefly  the  improvement  of 
self-fertilising  plants. 

Self-fertilising  plants.  Each  variety  of  a  naturally  self-fertilising 
plant  is  a  highly  inbred  line,  and  the  only  genetic  variation  within  it  is 
that  arising  from  mutation.  Genetic  improvement  can  therefore  be 
made  only  by  choosing  the  best  of  the  existing  varieties  or  by  crossing 
different  varieties.  The  purpose  of  the  crossing  is  to  produce  genetic 
variation  on  which  selection  can  operate.  After  a  cross  has  been 
made, the F1 and subsequent generations are allowed to self-fertilise
naturally.  A  new  population,  subdivided  into  lines,  is  thus  made,  and 
the  lines  become  differentiated  as  the  inbreeding  proceeds.  Selection 
is applied by choosing the best lines, which become new and
improved varieties. The essential point to note is that what is sought is
an  improved  inbred  line,  and  not  a  superior  crossbred  generation:  the 
purpose  of  the  crossing  is  to  provide  genetic  variation  and  not  to 
produce  heterosis.  The  process  of  crossing  and  selection  among  the 
subsequent  lines  may  be  repeated  cyclically.  If  two  good  lines  are 
selected  out  of  the  first  cross,  these  may  be  crossed  and  a  second  cycle 
of  selection  applied  to  the  derived  lines.  The  genetic  properties  of  a 
population  derived  from  a  cross  of  two  highly  inbred  lines,  such  as 
two varieties of a self-fertilising plant, are peculiar in that all
segregating genes have a frequency of 0.5 in the population as a whole.
This  greatly  simplifies  the  theoretical  description  of  the  variances  and 
covariances. Special methods of analysis applicable to such
populations have been developed which lead to a separation of the additive,
dominance, and epistatic effects, and so provide a guide to the
possibilities of improvement in the population of lines derived from a
particular  cross.  For  a  description  of  these  methods,  see  Mather 
(1949),  Hayman  (1958),  and  Kempthorne  (1957,  Ch.  21)  where  other 
references  are  given. 

Outbreeding plants, and animals. Applied to naturally
outbreeding plants and to animals, the purpose of crossing inbred lines is
to produce superior cross-bred, or F1, individuals. The utilisation of
heterosis  in  this  way  depends  on  selection  as  well  as  on  the  inbreeding 
and  crossing.  The  selection  is  applied,  in  principle,  to  the  crosses, 
with  the  aim  of  finding  pairs  of  lines  that  cross  well,  so  that  the  lines 
may be perpetuated and provide cross-bred individuals for
commercial use. In practice, however, the performance of the lines
themselves  has  to  be  taken  into  account,  because  the  lines  must  be 
reasonably  productive  if  they  are  to  be  maintained  and  used  for 
crossing.  This  method  has  been  very  successful  with  plants,  and  has 
led  to  an  improvement  of  50  per  cent  in  the  yield  of  maize  grown 
commercially  in  the  United  States,  since  hybrid  seed  started  to  be 
used in the early 1930's (Mangelsdorf, 1951). Its success with
animals,  however,  has  been  much  less  notable.  The  reasons  probably 
lie  chiefly  in  the  greater  amount  of  space  and  labour  required  by 
animals  and  in  their  lower  reproductive  rate,  both  of  which  add 
greatly  to  the  difficulty  of  producing  and  testing  the  inbred  lines. 
During  the  inbreeding  a  large  proportion  of  the  lines  die  out  from 
inbreeding  depression  before  a  reasonably  high  degree  of  inbreeding 
has  been  attained.  Consequently  the  inbreeding  programme  must 
start  with  a  very  large  number  of  lines  if  enough  are  to  be  left  after  the 
wastage  to  give  some  scope  for  the  selection  of  good  crosses.  Another 
point  is  that  with  plants  that  can  be  self-fertilised,  such  as  maize,  the 
inbreeding  proceeds  much  faster  than  with  animals.  To  attain  an 
inbreeding coefficient of, say, 90 per cent would require only 4
years for maize, but 11 years for pigs or chickens, and about 50 years
for cattle with a 4- or 5-year generation interval.
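
The figures just quoted can be reproduced from the usual recurrence relations for the inbreeding coefficient under self-fertilisation and under full-sib mating. The sketch below assumes one generation a year for maize, pigs, and chickens, and starts from a non-inbred base; the function names are illustrative.

```python
def selfing(f_prev, f_prev2):
    """Inbreeding coefficient after one more generation of self-fertilisation."""
    return 0.5 * (1.0 + f_prev)


def full_sib(f_prev, f_prev2):
    """Inbreeding coefficient after one more generation of full-sib mating,
    from the coefficients of the previous two generations."""
    return 0.25 * (1.0 + 2.0 * f_prev + f_prev2)


def generations_to_reach(target_f, next_f):
    """Count generations until the inbreeding coefficient first reaches
    target_f, starting from a non-inbred base (F = 0)."""
    f_prev2, f_prev, generations = 0.0, 0.0, 0
    while f_prev < target_f:
        f_prev2, f_prev = f_prev, next_f(f_prev, f_prev2)
        generations += 1
    return generations


print("selfing :", generations_to_reach(0.9, selfing), "generations")
print("full-sib:", generations_to_reach(0.9, full_sib), "generations")
```

Selfing takes 4 generations and full-sib mating 11; with a generation interval of 4 to 5 years, 11 generations of cattle occupy roughly 50 years.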

Let us now consider the genetic principles on which the
utilisation of heterosis depends. It was shown in Chapter 14 that crosses
made at random between lines inbred without selection are expected
to have a mean value equal to that of the base population. This is
the  reason  why  inbreeding  and  crossing  alone  cannot  be  expected  to 
lead  to  an  improvement,  but  must  be  supplemented  by  selection.  In 
practice  some  improvement  can  be  expected  from  the  effects  of 
natural  selection.  It  eliminates  lethal  and  severely  deleterious  genes 
during  the  inbreeding,  and  in  so  far  as  these  genes  affect  the  desired 
character  an  improvement  of  the  cross-bred  mean  over  that  of  the 
base  population  is  to  be  expected.  But  this  improvement  will  not  be 
very  great,  because  the  deleterious  genes  eliminated  will  have  been  at 
low  frequencies  in  the  base  population — and  the  more  harmful,  the 
lower  the  frequency — so  that  their  effect  on  the  population  mean  will 
be  small.  It  has  been  calculated,  on  the  basis  of  assumptions  about 
the number of loci concerned and their mutation rates, that an
improvement of 5 per cent in fitness is the most that could be expected
from  the  elimination  of  deleterious  recessive  genes  (Crow,  1948,  1952). 
The  bulk  of  the  improvement,  therefore,  must  come  from  artificial 
selection  applied  to  the  economically  desirable  characters. 

The  crossing  of  inbred  lines  produces  no  genotypes  that  could  not 
occur  in  the  base  population.  But  whereas  the  best  genotypes  occur 
only  in  certain  individuals  in  the  base  population,  they  are  replicated 
in  every  individual  of  certain  crosses.  It  is  in  this  replication  of  a 
desirable  genotype  that  the  chief  merit  of  the  method  lies.  Let  us,  for 
simplicity,  consider  crosses  between  fully  inbred  lines.  The  gametes 
produced  by  a  highly  inbred  line  are  all  identical,  except  for  mutation. 
And  the  gene  content  of  the  gametes  of  any  one  line  could  in  principle 
be found in a gamete from the base population. Therefore the
genotype of the F1 of two lines could in principle be found in an individual
of the base population. Thus, provided there has been no selection
during the inbreeding, a set of crosses made at random is genetically
equivalent to a set of individuals taken at random from the base
population; and the individuals of one cross are replicates of one individual
in the base population. This replication of a genotype in the
individuals of a cross allows the genotypic value to be measured with little
error; whereas the genotypic value of an individual in the base
population is only crudely measured by its phenotypic value. Further, it is
the genotypic value that is measured in the cross and can be
reproduced indefinitely, as long as the inbred lines are maintained; whereas
only the breeding value can be reproduced by selection of individuals
in a non-inbred population. Therefore the condition under which
inbreeding and crossing are likely to be a better means of improvement
than selection without inbreeding is when much of the genetic
variance of the character is non-additive.

The  amount  of  improvement  that  can  be  made  by  selection  among 
a  number  of  crosses  depends  on  the  amount  of  variation  between  the 
crosses. The same relationship holds between the intensity of
selection, the standard deviation, and the selection differential as was
described in Chapter 11 and illustrated in Fig. 11.3. In the following
section  the  variance  between  crosses  made  at  random  between  pairs 
of  lines  inbred  without  selection  will  be  examined. 


Variance  between  Crosses 


The  variance  between  crosses  to  be  considered  is  the  variance  of 
the  true  means  of  the  crosses,  or  the  between-cross  component  as 
estimated  from  an  analysis  of  variance.  The  variance  of  the  observed 
means  will  contain  a  fraction  of  the  within-cross  component  for  the 
reasons  explained  in  connexion  with  family  selection  in  Chapter  13. 
We  shall  assume  that  the  experimental  design  has  eliminated  all 
non-genetic  sources  of  variation  from  the  between-cross  component. 

If the lines crossed are fully inbred there will be no genetic
variance within the crosses, and the variance between crosses will be
equal  to  the  genotypic  variance  in  the  base  population,  since  each 
cross  is  equivalent  to  an  individual  of  the  base  population.  When  the 
lines  are  only  partially  inbred,  however,  some  genetic  variance  will 
appear  within  the  crosses,  and  the  between-cross  variance  will  be  less 
than  with  fully  inbred  lines.  It  is  therefore  important  to  know  in 
what  manner  the  between-cross  variance  increases  as  inbreeding 
proceeds, since this will tell us how much is to be gained by
proceeding to high levels of inbreeding.

We  noted  that  crosses  between  fully  inbred  lines  are  genetically 
equivalent  to  single  individuals  of  the  base  population.  Crosses 
between  partially  inbred  lines  are  analogous,  not  to  individuals,  but 
to  families,  with  degrees  of  relationship  dependent  on  the  inbreeding 
coefficient of the lines. The variance between families can be
formulated in terms of the degree of relationship in the families
(Kempthorne, 1954), and this formulation may be extended to crosses by
regarding  the  crosses  as  families  with  a  relationship  depending  on  the 
inbreeding  coefficient  of  the  lines.  The  following  expression  is  then 
obtained  for  the  component  of  variance  between  crosses: 

$$\text{Between-cross variance} = FV_A + F^2V_D + F^2V_{AA} + F^3V_{AD} + F^4V_{DD} + \dots \qquad (16.1)$$

In this expression $V_A$ and $V_D$ are the additive and dominance
variances in the base population; $V_{AA}$, $V_{AD}$ and $V_{DD}$ are the interaction
components  as  explained  in  Chapter  8;  and  F  is  the  inbreeding 
coefficient  of  the  lines  as  specified  below.  The  interaction  components 
are  included  because  epistasis  may  have  important  effects.  Only 
two-factor  interactions,  however,  are  shown:  the  higher  interactions 
have  coefficients  in  correspondingly  higher  powers  of  F.  (For  every 
A in the subscript there is a factor $F$, and for every D a factor $F^2$.)
The formulation in equation 16.1 is conditional on the following
specifications about how the crosses are made. 1. All lines have the
same  coefficient  of  inbreeding.  2.  All  lines  have  independent  ancestry 
back  to  the  base  population;  i.e.  there  is  no  relationship  between  the 
lines.  3.  Each  cross  is  made  from  many  individuals  of  the  parent 
lines;  and  these  individuals  are  not  related  to  each  other  within  their 
lines.  This  means  that  the  genetic  variance  within  the  lines  is  fully 
represented  within  the  crosses.  4.  The  coefficient  of  inbreeding,  F, 
refers  not  to  the  individuals  used  as  parents  of  the  crosses,  but  to  their 
progeny  if  they  were  mated  within  their  own  lines;  in  other  words,  F 
is  the  inbreeding  coefficient  of  the  next  generation  of  the  lines. 

Let  us  now  examine  the  expression  16.1  and  consider  what  it  tells 
us  about  the  variance  between  crosses.  When  the  inbreeding  coeffi- 
cient is  unity  the  between-cross  variance  is,  as  we  have  already  stated, 
simply  the  sum  of  all  the  components  of  genetic  variance  in  the  base 
population.  During  the  progress  of  the  inbreeding  the  contribution 
of the additive variance increases linearly with F; those of the dominance
variance and of A × A interactions increase with the square of
F;  and  the  other  interaction  components  with  the  third  or  fourth 
power  of  F.  This  means  that  the  dominance  and  interaction  com- 
ponents contribute  proportionately  more  at  higher  levels  of  inbreed- 
ing than  at  lower  levels.  If  the  character  is  one  with  predominantly 
non-additive  variance,  the  crosses  will  differ  little  in  merit  during  the 
early  stages  but  will  differentiate  rapidly  in  the  final  stages.  Since  this 
is  the  sort  of  character  for  which  inbreeding  and  crossing  is  likely  to 
be  the  most  effective  means  of  improvement,  it  is  clear  that  inbreed- 
ing must  be  taken  to  a  fairly  high  level  if  anything  approaching  its  full 
benefit  is  to  be  realised.  Some  idea  of  the  level  of  inbreeding  required 
can be obtained by noting that with F = 0.5 the between-cross variance
is equal to the variance between full-sib families in the base
population.  At  this  level  of  inbreeding,  therefore,  the  best  cross  would 
do  no  more  than  replicate  the  best  full-sib  family  in  a  non-inbred 
population. 
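
As a numerical check on expression 16.1, the following is a minimal Python sketch that evaluates the between-cross variance at a given F. The function name and the component values are hypothetical, chosen only for illustration; the weighting rule (one factor F for every A in the subscript and F^2 for every D) is the one stated above, and interactions beyond two factors are ignored.

```python
def between_cross_variance(F, components):
    """Evaluate expression 16.1.

    `components` maps a subscript such as "A", "D", "AA", "AD", "DD" to the
    corresponding variance in the base population. Each component is weighted
    by F raised to (number of A's + 2 * number of D's) in its subscript.
    """
    total = 0.0
    for subscript, variance in components.items():
        power = subscript.count("A") + 2 * subscript.count("D")
        total += F ** power * variance
    return total


# Hypothetical causal components (arbitrary units), summing to 100.
components = {"A": 40.0, "D": 30.0, "AA": 10.0, "AD": 10.0, "DD": 10.0}

for F in (0.25, 0.5, 0.75, 1.0):
    print(F, between_cross_variance(F, components))
```

With these illustrative values the between-cross variance at F = 0.5 is 31.875 out of a final 100, which is the same as (1/2)V_A + (1/4)V_D + (1/4)V_AA + (1/8)V_AD + (1/16)V_DD, the covariance of full sibs. More than two-thirds of the eventual variance between crosses is therefore still to come at that stage.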

Combining  ability.  The  components  of  genetic  variance  making 
up  the  between-cross  variance  that  we  have  been  discussing  are 
causal  components,  in  the  sense  explained  in  Chapter  9.  The  vari- 
ance between  crosses,  however,  can  also  be  analysed  into  observa- 
tional components  in  the  following  way.  Suppose  a  set  of  lines  are 
crossed  at  random,  each  line  being  simultaneously  crossed  with  a 
number of others. We can then calculate for each line its mean performance,
i.e. the mean value of the F1's in crosses with other lines.
This  is  known  as  the  general  combining  ability  of  the  line.  The 
performance  of  a  particular  cross  may  deviate  from  the  average 
general  combining  ability  of  the  two  lines,  and  this  deviation  is 
known  as  the  special  (or  specific)  combining  ability  of  the  cross.  Or,  if 
we  measure  the  mean  values  as  deviations  from  the  general  mean  of 
all  crosses,  we  can  express  the  value  of  a  certain  cross  as  the  sum  of 
the  general  combining  abilities  of  the  two  lines  and  the  special 
combining  ability  of  the  pair  of  lines.  Thus  the  mean  value  of  the 
cross  of  line  X  with  line  Y  is 


$M_{XY} = \mathrm{G.C.}_X + \mathrm{G.C.}_Y + \mathrm{S.C.}_{XY}$   (16.2)


where  G.C.  and  S.C.  stand  for  the  general  and  special  combining 
abilities.  The  variance  between  crosses  can  therefore  be  analysed 
into  two  components:  variance  of  general  combining  abilities  and 
variance  of  special  combining  abilities;  the  latter  being,  in  statistical 
terms,  the  interaction  component. 
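
The partition in equation 16.2 can be sketched in Python as follows. This is an illustration under stated assumptions, not a procedure given in the text: it supposes that every pair of lines has been crossed, that reciprocals are pooled, and that no line is tested as an inbred. Under those assumptions the simple deviation of a line's mean from the general mean underestimates its general combining ability by the factor (n - 2)/(n - 1), so the sketch uses the least-squares estimate instead. The function name and data layout are hypothetical.

```python
from itertools import combinations


def combining_abilities(cross_means):
    """Split cross means into general mean, G.C. per line and S.C. per cross.

    `cross_means` maps frozenset({X, Y}) to the mean of the cross X x Y.
    Every pair of lines must be present, and at least three lines are needed.
    """
    lines = sorted({line for pair in cross_means for line in pair})
    n = len(lines)
    if n < 3:
        raise ValueError("at least three lines are needed")

    # Total over all crosses, each cross counted once.
    grand_total = sum(cross_means[frozenset(p)] for p in combinations(lines, 2))
    general_mean = 2 * grand_total / (n * (n - 1))

    # Total of all crosses in which each line takes part.
    line_totals = {
        x: sum(cross_means[frozenset((x, y))] for y in lines if y != x)
        for x in lines
    }

    # Least-squares estimate of general combining ability (deviation scale).
    gca = {x: (n * line_totals[x] - 2 * grand_total) / (n * (n - 2)) for x in lines}

    # Special combining ability is what remains after removing the mean
    # and the two general combining abilities, as in equation 16.2.
    sca = {
        pair: mean - general_mean - sum(gca[x] for x in pair)
        for pair, mean in cross_means.items()
    }
    return general_mean, gca, sca


# Hypothetical example: four lines with purely additive combining abilities.
true_gca = {"P": 2.0, "Q": -1.0, "R": 0.0, "S": -1.0}
cross_means = {
    frozenset((x, y)): 50.0 + true_gca[x] + true_gca[y]
    for x, y in combinations(true_gca, 2)
}
general_mean, gca, sca = combining_abilities(cross_means)
print(general_mean, gca)
```

Because the hypothetical crosses were built without any interaction, the estimated general combining abilities reproduce the values put in and every special combining ability comes out as zero.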

The  observational  components  of  variance  attributable  to  general 
and  special  combining  ability  are  made  up  of  the  causal  components 
in  the  following  way. 


Variance of crosses attributable to:

General combining ability $= F V_A + F^2 V_{AA} + \dots$
Special combining ability $= F^2 V_D + F^3 V_{AD} + F^4 V_{DD} + \dots$   (16.3)

So  differences  of  general  combining  ability  are  due  to  the  additive 
genetic variance in the base population, and to A × A interactions;
and differences of special combining ability are attributable to the
non-additive genetic variance. Consequently the variance of general
combining ability increases linearly with F (apart from the interaction
component),  while  the  variance  of  special  combining  ability  increases 
with  higher  powers  of  F.  It  is  therefore  the  special,  and  not  the 
general,  combining  ability  that  is  expected  to  increase  more  rapidly 
as  the  inbreeding  reaches  high  levels. 
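
The different rates of increase of the two observational components can be illustrated with a short Python sketch of the expressions in 16.3. The component values are hypothetical and the series is truncated at two-factor interactions, as in the text.

```python
def combining_ability_variances(F, VA, VD, VAA, VAD, VDD):
    """Return (general, special) combining-ability variances from 16.3."""
    general = F * VA + F**2 * VAA
    special = F**2 * VD + F**3 * VAD + F**4 * VDD
    return general, special


# Hypothetical causal components in the base population (arbitrary units).
base = dict(VA=40.0, VD=30.0, VAA=10.0, VAD=10.0, VDD=10.0)

for F in (0.25, 0.5, 0.75, 1.0):
    general, special = combining_ability_variances(F, **base)
    share = special / (general + special)
    print(f"F={F:.2f}  general={general:.3f}  special={special:.3f}  "
          f"special share={share:.3f}")
```

With these values special combining ability accounts for about 29 per cent of the between-cross variance at F = 0.5 but for half of it at F = 1, which is the pattern described above.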

Example  16.1.  An  analysis  of  egg-laying  in  crosses  between  highly 
inbred  lines  of  Drosophila  melanogaster  is  reported  by  Gowen  (1952). 
Five  lines  were  crossed  in  all  ways,  including  reciprocals,  and  the  numbers 
of  eggs  laid  by  females  in  the  fifth  to  ninth  days  of  adult  life  were  recorded. 
The  analysis  of  the  crosses  yielded  the  following  percentage  composition 
of  the  variance  of  egg  number: 

Variance component                 % of total
General combining ability             11.3
Special combining ability              9.7
Differences between reciprocals        2.3
Within crosses                        76.6

Thus  about  half  the  variance  between  crosses  was  due  to  general,  and  half 
to  special,  combining  ability. 


Some  of  the  methods  of  improvement  by  crossing  aim  at  utilising 
only  the  variance  of  general  combining  ability,  and  then  the  measure- 
ment of  the  general  combining  ability  of  the  lines  becomes  an  im- 
portant procedure.  In  addition  to  the  making  of  specific  crosses 
between  the  lines,  there  are  two  other  methods  of  measuring  general 
combining  ability.  A  method  convenient  for  use  with  plants  is  known 
as  the  polycross  method.  A  number  of  plants  from  all  the  lines  to  be 
tested  are  grown  together  and  allowed  to  pollinate  naturally,  self- 
pollination  being  prevented  by  the  natural  mechanism  for  cross- 
pollination, or by the arrangement of the plants in the plot. The seed
from the plants of one line is therefore a mixture of random crosses
with other lines, and its performance when grown tests the general
combining  ability  of  that  line.  Another  method,  applicable  also  to 
animals,  is  known  as  top-crossing.  Individuals  from  the  line  to  be 
tested  are  crossed  with  individuals  from  the  base  population.  The 
mean  value  of  the  progeny  then  measures  the  general  combining 
ability  of  the  line,  because  the  gametes  of  individuals  from  the  base 
population  are  genetically  equivalent  to  the  gametes  of  a  random  set 
of  inbred  lines  derived  without  selection  from  the  base  population. 
These methods are essentially methods for comparing the general
combining  abilities  of  different  lines,  and  so  leading  to  the  choice  of 
the  lines  most  likely  to  yield  the  best  cross,  among  all  the  crosses  that 
might  be  made  between  the  available  lines.  But  if  much  of  the  varia- 
tion between  crosses  is  due  to  special  combining  ability,  then  the 
general  combining  ability  of  two  lines  will  not  provide  a  reliable 
guide  to  the  performance  of  their  cross. 


Methods  of  Selection  for  Combining  Ability 


The  methods  of  improvement  by  inbreeding  and  crossing  fall  into 
two  groups,  according  to  whether  they  are  designed  to  utilise  only 
the  variation  in  general  combining  ability  or  to  utilise  also  the  varia- 
tion in  special  combining  ability. 

Selection  for  general  combining  ability.  When  the  improve- 
ment of  general  combining  ability  only  is  sought  the  procedure  of 
selection  is  much  simplified.  The  general  combining  abilities  of  all 
available  lines  can  be  measured,  as  already  explained,  without  the 
necessity  of  making  and  testing  all  the  possible  crosses  between  them. 
Some  selection  can  usefully  be  applied  to  the  lines  before  they  are 
tested  in  crosses.  There  is  some  degree  of  correlation  between  a  line's 
performance  as  an  inbred  and  its  general  combining  ability,  so  a 
proportion  of  lines  can  be  discarded  on  the  basis  of  their  own  per- 
formance before  the  crosses  are  made.  And,  finally,  there  is  less 
to  be  lost  by  making  the  crosses  at  a  relatively  low  coefficient  of  in- 
breeding. Selection  for  general  combining  ability  may  be  repeated 
in  cycles,  a  procedure  known  in  plant  breeding  as  recurrent  selection. 
(In  animal  breeding  this  term  has  come  to  have  a  different  meaning, 
as  will  be  explained  below.)  Lines  are  inbred  by  self-fertilisation 
for  one  or  two  generations  and  their  general  combining  abilities 
tested.  The  lines  with  the  best  general  combining  abilities  are  then 
crossed  and  a  second  cycle  of  inbreeding  and  selection  carried  out. 
A review of the progress made by this method is given by Sprague
(1952).

The  seed  for  commercial  use  is  usually  not  made  by  a  single  cross 
of  two  lines,  but  by  a  3-way  or  4-way  cross.  The  object  of  this  is  to 
overcome  the  generally  low  production  of  an  inbred  used  as  seed 
parent. In a 3-way cross the F1 of two lines is used as seed parent and
crossed with a third inbred line. In a 4-way cross two F1's of different
pairs of lines are crossed. The performance of 3-way and 4-way
crosses  can  be  reliably  predicted  from  the  performance  of  the  con- 
stituent single  crosses. 
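
The text does not say how the prediction is made. One commonly used rule, given here only as an assumption for illustration, is to average the single crosses between lines that enter the multiple cross from opposite sides, leaving out the single crosses that serve as parents. The function names and yields below are hypothetical.

```python
def predict_three_way(single, a, b, c):
    """Predict (A x B) x C as the mean of the single crosses A x C and B x C."""
    return (single[frozenset((a, c))] + single[frozenset((b, c))]) / 2


def predict_four_way(single, a, b, c, d):
    """Predict (A x B) x (C x D) as the mean of the four single crosses
    between a line on one side and a line on the other."""
    pairs = [(a, c), (a, d), (b, c), (b, d)]
    return sum(single[frozenset(pair)] for pair in pairs) / 4


# Hypothetical single-cross means (arbitrary units).
single = {
    frozenset(("A", "B")): 60.0,
    frozenset(("A", "C")): 70.0,
    frozenset(("A", "D")): 80.0,
    frozenset(("B", "C")): 90.0,
    frozenset(("B", "D")): 100.0,
    frozenset(("C", "D")): 65.0,
}

print(predict_three_way(single, "A", "B", "C"))
print(predict_four_way(single, "A", "B", "C", "D"))
```

The rule rests on the fact that the F1 used as a parent transmits gametes drawn half from each of its two constituent lines, so each progeny individual resembles one of the averaged single crosses at any one locus.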

Even  though  selection  for  general  combining  ability  is  widely 
used  in  plant  breeding  and  has  abundantly  proved  its  success,  it  is 
not,  perhaps,  altogether  clear  why  it  is  preferred  to  selection  without 
inbreeding,  made  either  by  individual  selection  or  by  family  selection. 
Since  the  variation  in  general  combining  ability  is  attributable  to 
additive  variance  in  the  population  from  which  the  lines  were  derived, 
selection  should  be  effective  without  inbreeding.  Comparisons  of  the 
two  methods  by  experiment  have  not  been  made  on  a  scale  sufficient 
to  prove  convincingly  the  superiority  of  selection  with  inbreeding 
(see  Robinson  and  Comstock,  1955). 

Selection  for  general  and  specific  combining  ability.  The 
specific  combining  ability  of  a  cross  cannot  be  measured  without 
making  and  testing  that  particular  cross.  Therefore  to  achieve  a 
reasonably  high  intensity  of  selection  for  specific  combining  ability  a 
large  number  of  crosses  must  be  made  and  tested.  Is  no  short-cut 
possible?  Could  the  superior  combining  ability  not  be,  as  it  were, 
built  into  the  lines  by  selection?  From  the  causes  of  heterosis  ex- 
plained in  Chapter  14  it  is  clear  that  what  is  wanted  is  two  lines  that 
differ  widely  in  the  gene  frequencies  at  all  loci  that  affect  the  character 
and  that  show  dominance.  It  should  therefore  be  possible  to  build 
up  these  differences  of  gene  frequency  in  two  lines  by  selection. 
Instead  of  the  differences  of  gene  frequency  being  produced  by  the 
random  process  of  inbreeding,  they  would  be  produced  by  the  directed 
process  of  selection,  which  would  be  both  more  effective  and  more 
economical.  Two  methods  based  on  this  idea  have  been  devised. 
These  methods,  though  originating  from  plant  breeding,  provide — 
in  theory  at  least — the  most  hopeful  means  of  utilising  heterosis  in 
animals.  We  shall  first  describe  the  method  known  as  reciprocal 
recurrent  selection,  or  simply  as  reciprocal  selection.  In  outline,  the 
procedure  is  as  follows. 

The  start  is  made  from  two  lines,  say  A  and  B.  (We  shall  call 
them  "lines"  even  though  they  will  not  be  deliberately  inbred.) 
Crosses are made reciprocally, a number of A ♂♂ being mated to
B ♀♀, and a number of B ♂♂ to A ♀♀. The cross-bred progeny are then
measured  for  the  character  to  be  improved  and  the  parents  are  judged 
from  the  performance  of  their  progeny.  The  best  parents  are  selected 
and  the  rest  discarded,  together  with  all  the  cross-bred  progeny,  which 
are used only to test the combining ability of the parents. The selected
individuals  must  then  be  remated,  to  members  of  their  own  line,  to  pro- 
duce the  next  generation  of  parents  to  be  tested.  These  are  crossed 
again  as  before  and  the  cycle  repeated.  It  is  seldom  practicable  to  select 
among  the  female  parents,  and  the  selection  is  chiefly  applied  to  the 
males.  Each  male  is  mated  to  several  females  of  the  other  line  so  that 
the  judgment  of  his  combining  ability  may  be  based  on  a  reasonably 
large  number  of  progeny.  Most  of  these  females  are  needed  to  mate  to 
the  selected  males  of  their  own  line  for  the  continuation  of  the  line. 
Deliberate  inbreeding  is  avoided  as  far  as  possible,  for  the  reason  to  be 
explained  below.  The  use  of  all  the  females  as  parents  in  their  own  lines 
helps  to  reduce  the  rate  of  inbreeding  and  allows  relatively  few  males  to 
be  used,  which  intensifies  the  selection. 

An  essential  prerequisite  is  that  there  should  be  some  difference  of 
gene  frequency  between  the  two  lines  at  the  beginning,  or  else  selec- 
tion for  combining  ability  will  be  unable  to  produce  a  differentiation 
of  the  lines.  Any  locus  at  which  the  gene  frequencies  are  the  same  in 
the  two  lines  will  be  in  equilibrium,  though  an  unstable  equilibrium. 
Any  shift  in  one  direction  or  the  other  will  give  the  selection  something 
to  act  on  and  the  difference  will  be  increased.  The  initial  difference 
between  the  lines  may  be  obtained  by  starting  from  two  different 
breeds  or  varieties,  choosing  two  that  already  cross  well;  or  by  de- 
liberate inbreeding,  up  to  perhaps  25  per  cent,  and  relying  on  random 
differentiation  of  gene  frequencies. 

Though  the  performance  of  the  cross  is  expected  to  increase 
under  this  method  of  selection,  the  performance  of  the  lines  them- 
selves in  respect  of  the  character  selected  is  expected  to  decrease,  for 
this  reason.  Characters  to  which  selection  would  be  applied  in  this 
way  are  those  subject  to  inbreeding  depression  and  heterosis;  that  is 
to  say,  those  in  which  dominance  is  directional.  The  changes  of  gene 
frequency  brought  about  by  the  selection  are  toward  the  extremes, 
and  consequently  the  mean  values  of  the  lines  will  decline  for  the 
reasons  explained  in  connexion  with  inbreeding  in  Chapter  14.  This 
decline  in  the  performance  of  the  lines,  however,  should  not  be  quite 
as  deleterious  as  the  effects  of  deliberate  inbreeding.  Inbreeding,  as  a 
random  process,  affects  all  loci,  and  the  mean  values  of  all  characters 
showing  directional  dominance  decline.  But  under  reciprocal  selec- 
tion it  is  only  the  selected  character  that  should  decline,  except  in  so 
far  as  linked  loci  are  carried  along.  Nevertheless,  reproductive  fitness 
is  nearly  always  a  component  of  economic  value,  and  it  is  doubtful 
how far the distinction will hold. This, however, is the reason why
deliberate  inbreeding  of  the  lines  is  to  be  avoided. 
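
The divergence of the two lines, and the accompanying decline of the lines themselves, can be illustrated by a deliberately simple Python sketch for a single overdominant locus. Everything in it beyond the verbal argument above is an assumption made for illustration: genotypic values of +a, d and -a for the two homozygotes and the heterozygote, deterministic change without drift, and a response in each line proportional to p(1 - p) times the slope of the cross mean on that line's gene frequency. The function names and the constant k are hypothetical.

```python
def cross_mean(p_a, p_b, a, d):
    """Mean of the cross between lines with frequencies p_a and p_b of allele A1."""
    q_a, q_b = 1 - p_a, 1 - p_b
    return a * (p_a * p_b - q_a * q_b) + d * (p_a * q_b + q_a * p_b)


def line_mean(p, a, d):
    """Mean of a random-mating line with frequency p of allele A1."""
    q = 1 - p
    return a * (p - q) + 2 * d * p * q


def reciprocal_selection(p_a, p_b, a, d, generations=200, k=0.5):
    """Iterate selection of each line for the performance of its cross."""
    for _ in range(generations):
        # Slope of the cross mean on the gene frequency of each line.
        slope_a = a + d * (1 - 2 * p_b)
        slope_b = a + d * (1 - 2 * p_a)
        new_a = p_a + k * p_a * (1 - p_a) * slope_a
        new_b = p_b + k * p_b * (1 - p_b) * slope_b
        # Keep the frequencies within their valid range.
        p_a = min(1.0, max(0.0, new_a))
        p_b = min(1.0, max(0.0, new_b))
    return p_a, p_b


# Pure overdominance (a = 0, d = 1) with a small initial difference.
start_a, start_b = 0.55, 0.45
end_a, end_b = reciprocal_selection(start_a, start_b, a=0.0, d=1.0)
print("cross:", cross_mean(start_a, start_b, 0.0, 1.0), "->",
      cross_mean(end_a, end_b, 0.0, 1.0))
print("line A:", line_mean(start_a, 0.0, 1.0), "->", line_mean(end_a, 0.0, 1.0))
```

Starting from equal frequencies of 0.5 in both lines, the slopes are zero and nothing changes, which is the equilibrium referred to above. A small initial difference is amplified until the lines approach fixation for different alleles; the cross mean then rises towards the heterozygote value while the mean of each line falls.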

The  second  method  is  simpler  in  procedure  than  reciprocal 
selection  described  above.  It  was  devised  as  a  modification  of  recur- 
rent selection,  intended  to  utilise  special  as  well  as  general  combining 
ability  (Hull,  1945),  and  as  yet  it  has  no  distinctive  name.  It  is  known 
variously as "Hull's modification of recurrent selection," "recurrent
selection  to  inbred  tester,"  "recurrent  selection  for  special  combining 
ability,"  and  in  animal  breeding  simply  as  "recurrent  selection."  It 
differs  from  reciprocal  selection  in  the  following  way.  Instead  of 
starting  with  two  lines  and  selecting  both  for  combining  ability  with 
the  other,  one  starts  with  only  one  line  and  selects  it  for  combining 
ability  with  a  "tester"  line  which  has  previously  been  inbred.  This 
reduces  the  amount  of  effort  spent  on  the  testing,  and  is  expected  to 
yield  more  rapid  progress  at  the  beginning  because  the  initial  differ- 
ences of  gene  frequency  between  the  line  and  the  tester  are  likely  to 
be  more  marked.  But  the  ultimate  gain  is  expected  to  be  less  than 
under  reciprocal  selection,  because  the  general  combining  ability  of 
the  tester  line  is  predetermined,  and  only  the  general  combining 
ability  of  the  selected  line  and  the  special  combining  ability  of  the 
cross  can  be  improved. 

The  two  methods  of  selection  for  special  combining  ability  de- 
scribed in  this  section  are  comparatively  new  methods  of  improvement 
and  very  little  practical  experience  of  them  has  yet  been  gained.  The 
account  of  them  given  here  is  consequently  based  almost  entirely  on 
theory.  Theoretical  assessments  of  their  merits  in  relation  to  other 
methods  have  been  made  by  Comstock,  Robinson,  and  Harvey 
(1949)  and  by  Dickerson  (1952).  Though  on  theoretical  grounds 
they  seem  promising,  the  results  of  the  only  experiments  so  far  pub- 
lished (Bell,  Moore,  and  Warren,  1955;  Rasmuson,  1956)  are  not 
encouraging. 

Before  we  leave  the  subject  of  inbreeding  we  must  give  some 
further  consideration  to  the  particular  genetic  property  that  makes 
selection  with  inbreeding  and  crossing  preferable  to  selection  without 
inbreeding.  From  the  theoretical  point  of  view,  and  leaving  all  prac- 
tical considerations  aside,  the  crucial  genetic  property  is  over- 
dominance  of  the  genes  concerned.  The  following  section  is  devoted 
to  a  consideration  of  overdominance  and  its  significance. 

Overdominance is the property shown by two alleles when the
heterozygote  lies  outside  the  range  of  the  two  homozygotes  in 
genotypic  value  with  respect  to  the  character  under  discussion.  Its 
meaning  was  illustrated  in  Fig.  2.3  with  respect  to  fitness  as  the 
character,  and  it  has  been  mentioned  from  time  to  time  in  other 
chapters.  We  saw  in  Chapter  2  how  selection  favouring  hetero- 
zygotes  leads  to  a  stable  gene  frequency  at  an  intermediate  value,  and 
how  this  overdominance  with  respect  to  fitness  probably  accounts  for 
much  of  the  stable  polymorphism  found  in  natural  populations. 
And  in  Chapter  12  we  saw  how  overdominance  may  be  a  source  of 
non-additive  genetic  variance  in  populations  that  have  reached  their 
limit  under  artificial  selection.  It  is,  however,  in  connexion  with  the 
utilisation  of  heterosis  by  inbreeding  and  crossing,  or  by  reciprocal 
selection,  that  overdominance  has  its  most  important  practical  conse- 
quences. In  earlier  chapters  two  basic  methods  of  improvement 
were  distinguished,  one  being  selection  without  inbreeding,  and  the 
other  inbreeding  followed  by  crossing.  In  this  chapter  we  have  seen 
that  selection  is  an  integral  part  of  the  second  method  also.  The 
essential  distinction  therefore  lies  in  the  crossing,  rather  than  in  the 
selection.  Now,  crossing  two  lines  in  which  different  alleles  are  fixed 
gives  an  F1  in  which  all  individuals  are  heterozygotes;  and  this  is  the 
only  way  of  producing  a  group  of  individuals  that  are  all  heterozy- 
gotes. In  a  non-inbred  population  no  more  than  50  per  cent  of  the 
individuals  can  be  heterozygotes  for  a  particular  pair  of  alleles. 
Consequently,  if  heterozygotes  of  a  particular  pair  of  alleles  are 
superior  in  merit  to  homozygotes,  inbreeding  and  crossing  will  be  a 
better  means  of  improvement  than  selection  without  inbreeding. 
Furthermore,  it  is  only  when  there  is  overdominance  with  respect  to 
the  desired  character,  or  combination  of  characters,  that  inbreeding 
and  crossing  can  achieve  what  selection  without  inbreeding  cannot. 
Under  any  other  conditions  of  dominance  the  best  genotype  is  one  of 
the  homozygotes,  and  all  individuals  can  be  made  homozygous  by 
selection,  without  the  disadvantages  attendant  on  inbreeding  and 
much  more  simply  than  by  methods  dependent  on  crossing.  It  was 
stated  earlier  in  this  chapter  that  the  potentialities  of  inbreeding  and 
crossing  are  greatest  when  there  is  much  non-additive  genetic  vari- 
ance and  little  additive.  Now  we  see  that  this  is  only  part  of  the  truth: 
in principle inbreeding and crossing can surpass selection without inbreeding
only when a substantial part of the non-additive variance is
due to overdominance. It is therefore of great practical importance
to  know  whether  overdominance  with  respect  to  economically 
desirable  characters  is  a  major  source  of  variation.  It  is  also  of  great 
theoretical  interest  to  know  whether  overdominance  with  respect  to 
natural  fitness  is  a  common  phenomenon  affecting  many  loci,  because 
natural  selection  favouring  heterozygotes  would  be  a  potent  factor 
tending  to  maintain  genetic  variation  in  populations.  This  point  will 
be  discussed  further  in  Chapter  20. 
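
The argument that crossing surpasses selection only under overdominance can be checked numerically for a single locus. The sketch below is an illustration under assumptions: genotypic values of +a, d and -a for the two homozygotes and the heterozygote, random mating within the non-inbred population, and an F1 between lines fixed for different alleles, which therefore consists wholly of heterozygotes with mean d. The function names are hypothetical.

```python
def population_mean(p, a, d):
    """Mean of a random-mating population with gene frequency p."""
    q = 1 - p
    return a * (p - q) + 2 * d * p * q


def best_random_mating_mean(a, d, steps=1000):
    """Highest mean attainable by changing gene frequency alone (grid search)."""
    return max(population_mean(i / steps, a, d) for i in range(steps + 1))


def max_heterozygote_frequency(steps=1000):
    """Largest frequency of heterozygotes, 2pq, under random mating."""
    return max(2 * (i / steps) * (1 - i / steps) for i in range(steps + 1))


cases = {"partial dominance": (1.0, 0.5), "complete dominance": (1.0, 1.0),
         "overdominance": (1.0, 2.0)}
for name, (a, d) in cases.items():
    print(name, "best by selection:", best_random_mating_mean(a, d),
          "F1 of fixed lines:", d)
print("maximum heterozygote frequency:", max_heterozygote_frequency())
```

With partial or complete dominance the best that selection can reach is the homozygote value a, which the F1 does not exceed. Only in the overdominant case (d greater than a) does the F1 mean of d exceed the best random-mating mean, which works out at (a^2 + d^2)/2d.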

The  contribution  of  overdominance  to  the  variance,  and  the  pro- 
portion of  loci  that  show  overdominance,  are  really  two  different 
questions.  Genes  that  are  overdominant  with  respect  to  fitness  will  be 
at  intermediate  frequencies  and  will  therefore  contribute  much 
more  variation  than  genes  at  low  frequencies.  So  overdominance  may 
be  a  major  source  of  variation  and  yet  be  a  property  of  only  a  few 
loci. 

The  evidence  concerning  overdominance  has  been  compre- 
hensively reviewed  by  Lerner  (1954),  who  reaches  the  conclusion  that 
overdominance  with  respect  to  fitness  and  characters  closely  con- 
nected with  it  is  widespread  and  very  important.  A  contrary  view  is 
expressed by Mather (1955b) on the grounds that much of what
appears  to  be  overdominance  with  respect  to  certain  characters  in 
plants  can  be  attributed  to  epistatic  interaction.  These  two  conflicting 
opinions  will  be  enough  to  show  that  the  problem  of  overdominance 
remains  still  an  open  question.  The  aim  here  is  not  to  discuss  the 
opinions,  but  to  indicate  briefly  the  nature  of  the  evidence. 

The  evidence  concerning  overdominance  is  broadly  speaking  of 
two  sorts,  direct  and  indirect.  The  direct  evidence  comes  from  the 
comparison  of  heterozygotes  and  homozygotes  in  identifiable  geno- 
types. The  indirect  evidence  comes  from  the  study  of  the  expected 
consequences  of  overdominance  as  they  affect  the  genetic  properties  of 
a  population,  or  the  outcome  of  certain  breeding  methods.  Both  sorts 
of  evidence  are  complicated  by  linkage.  We  have  to  distinguish 
between  overdominance  as  a  property  of  a  single  locus,  and  over- 
dominance  as  a  property  of  a  segment  of  chromosome,  which  we  shall 
refer  to  as  apparent  overdominance.  Unequivocal  evidence  of  over- 
dominance  arising  from  a  single  locus  is  scarce  because  it  can  only  be 
obtained  from  a  locus  that  has  mutated  in  a  highly  inbred  line,  or 
from  a  population  in  which  coupling  and  repulsion  linkages  are  in 
equilibrium. The segregation that can be observed in practice, and
that  gives  rise  to  the  genetic  variation  in  a  population,  is  usually  not  a 
segregation  of  single  loci  but  of  segments  of  chromosome,  longer  or 
shorter  according  to  the  amount  of  crossing-over.  These  segments 
of  chromosome,  or  units  of  segregation,  can  show  overdominance 
even  though  the  separate  loci  do  not.  All  that  is  needed  to  produce 
some  degree  of  apparent  overdominance  is  two  genes,  linked  in 
repulsion,  and  both  partially  recessive.  Its  most  extreme  form  is  pro- 
duced by  two  lethal  genes  linked  in  repulsion — a  "balanced  lethal" 
system — when  the  heterozygote  of  the  segment  spanned  by  the  two 
loci  is  the  only  viable  genotype. 
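
How two partially recessive genes linked in repulsion produce apparent overdominance can be seen in a small Python sketch. It assumes, purely for illustration, two loci with equal effects that combine additively, genotypic values of +a, d and -a at each locus with d less than a, and no crossing-over within the segment. The names are hypothetical.

```python
def locus_value(n_favourable, a, d):
    """Genotypic value at one locus from the number of favourable alleles."""
    return {0: -a, 1: d, 2: a}[n_favourable]


def segment_value(hap1, hap2, a, d):
    """Value of a two-locus segment; each haplotype is a tuple of 0/1 alleles,
    where 1 is the favourable, partially dominant allele."""
    return sum(locus_value(x + y, a, d) for x, y in zip(hap1, hap2))


# Repulsion phase: each segment carries one favourable and one unfavourable allele.
Ab, aB = (1, 0), (0, 1)
a, d = 1.0, 0.5   # d < a, so neither locus is overdominant on its own

print("Ab/Ab:", segment_value(Ab, Ab, a, d))
print("aB/aB:", segment_value(aB, aB, a, d))
print("Ab/aB:", segment_value(Ab, aB, a, d))
```

The heterozygote for the segment has the value 2d, while both segment homozygotes have the value zero, so the segment behaves as an overdominant unit although each locus shows only partial dominance.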

In  considering  the  direct  evidence  it  is  necessary  to  recognise  that 
overdominance  may  be  manifested  at  different  "levels"  according  to 
the  complexity  of  the  character  under  discussion.  A  pair  of  alleles 
with  pleiotropic  effects  may  be  found  not  to  exhibit  overdominance 
when  any  of  the  characters  they  affect  is  examined  separately;  yet  if 
natural  fitness  or  economic  merit  is  founded  on  a  combination  of 
these  characters,  the  alleles  may  show  overdominance  with  respect  to 
fitness  or  merit.  Thus  there  may  be  no  overdominance  at  the  lower 
level  of  the  simpler  characters,  but  overdominance  at  the  higher  level 
of  the  more  complex  character. 

Example  16.2.  An  example  of  overdominance  due  to  pleiotropy  is 
provided  by  the  pygmy  gene  in  mice,  already  referred  to  in  several  ex- 
amples in  earlier  chapters.  The  gene  reduces  body  size  and  in  the  homo- 
zygote  it  causes  sterility  (King,  1955).  In  respect  of  body  size  it  is  nearly, 
but  not  quite,  recessive.  In  respect  of  sterility  it  is  probably  also  nearly 
recessive,  though  this  was  not  proved.  In  neither  body  size  nor  sterility 
separately  is  there  overdominance.  But  if  small  size  were  desirable  (as  it 
was  in  the  experiment  in  which  the  gene  was  discovered),  then  under  these 
conditions  the  genotype  with  the  highest  merit  is  the  heterozygote,  since 
the  sterile  homozygotes  cannot  reproduce.  With  respect  to  merit,  or  fitness 
under  these  conditions,  the  gene  therefore  shows  overdominance.  The 
lethal  gene  in  the  line  of  Drosophila  selected  for  high  bristle  number, 
mentioned  in  Chapter  12,  is  another  case  of  the  same  sort  of  overdomin- 
ance; and  so  also  is  the  sickle-cell  anaemia  described  in  Example  2.4. 

The  observations  that  provide  direct  evidence  concerning  over- 
dominance  may  be  briefly  summarised  as  follows.  The  experience 
of  Mendelian  genetics  shows  that  mutant  genes  are  not  commonly 
overdominant  with  respect  to  their  main  effects.  Nor  is  overdomin- 
ance with  respect  to  natural  fitness  at  all  obvious.    Indeed,  if  there 
were more than a mild degree of overdominance with respect to
fitness a gene would not be rare enough to be classed as a "mutant."
Though the evidence of Mendelian genetics suggests that overdominance
is not a very common property of genes, many cases are nevertheless
known. Cases of overdominance due to pleiotropy, such as those
mentioned in the above example, are not infrequent. And overdominance
with respect to certain components of natural fitness has
been proved for some of the blood group genes in poultry (see Briles,
Allen, and Millen, 1957; Gilmour, 1958).

The  nature  of  the  indirect  evidence  concerning  overdominance 
is,  in  brief  summary,  as  follows. 

1. Experiments on the rate of loss of genetic variance during inbreeding
point to the operation of natural selection in favour of
heterozygotes  (Tantawy  and  Reeve,  1956;  Briles,  Allen,  and  Millen, 
1957;  Gilmour,  1958).  This  indicates  apparent  overdominance,  but 
it  does  not  prove  overdominance  at  the  individual  loci. 

2.  Crow  (1948,  1952)  has  given  reasons  for  thinking  that  the 
yield  of  grain  obtained  from  the  best  crosses  between  inbred  lines  of 
maize  is  too  high  to  be  accounted  for  without  overdominance  at  some 
loci.  The  reasoning  depends  on  assumptions  about  the  number  of 
loci  affecting  yield  and  the  mutation  rates,  and  the  conclusion  is 
therefore tentative. Robinson et al. (1956) point out that the reasoning
cannot justifiably be applied to maize crosses because the lines
crossed  generally  come  from  different  varieties  and  not  from  the 
same  base  population  as  required  by  Crow's  hypothesis. 

3.  Comstock  and  Robinson  (1952)  have  devised  methods  for 
measuring  the  average  degree  of  dominance  from  measurements 
made  on  non-inbred  populations.  Preliminary  results  from  maize 
(Robinson  and  Comstock,  1955)  suggest  that  there  cannot  be  over- 
dominance  (as  distinct  from  apparent  overdominance)  at  more  than  a 
small  proportion  of  the  loci  that  influence  the  yield  of  grain. 

4. The existence of polymorphism in natural populations, as
described  in  Chapter  2,  cannot  readily  be  explained  except  by  sup- 
posing that  the  genes  concerned  are  overdominant  with  respect  to 
fitness. 

From  the  foregoing  outline  of  the  evidence  it  is  clear  that  the 
problem  of  how  important  overdominance  is  remains  unsolved. 
Some  of  the  differences  of  opinion  about  it  may  arise  from  different 
views  of  what  phenomena  are  to  be  included  under  the  term — 
whether  apparent  overdominance  due  to  linkage,  or  overdominance 
due to pleiotropy, are to be regarded as overdominance or not.
Moreover, the question of how important overdominance is means
different things according to whether we are concerned with its
frequency as a property of genes, or with the amount of variation it
causes.


CHAPTER    17 


SCALE 

The  choice  of  a  suitable  scale  for  the  measurement  of  a  metric  charac- 
ter has  been  mentioned  several  times  in  the  foregoing  chapters.  The 
explanation  of  what  is  involved  in  the  choice  of  a  scale  and  a  discussion 
of  the  criteria  of  suitability  have,  however,  been  deferred  till  this 
point  because  these  are  matters  that  cannot  be  properly  appreciated 
until  the  nature  of  the  deductions  to  be  made  from  the  data  are 
understood.  In  other  words  the  choice  of  a  scale  has  to  be  made  in 
relation  to  the  object  for  which  the  data  are  to  be  used.  The  data 
from  any  experimental  or  practical  study  are  obtained  in  the  form 
most  convenient  for  the  measurement  of  the  character.  That  is  to 
say  the  phenotypic  values  are  recorded  in  grams,  pounds,  centimetres, 
days,  numbers,  or  whatever  unit  of  measurement  is  most  convenient. 
The  point  at  issue  is  whether  these  raw  data  should  be  transformed  to 
another  scale  before  they  are  subjected  to  analysis  or  interpretation. 
A  transformation  of  scale  means  the  conversion  of  the  original  units 
to  logarithms,  reciprocals,  or  some  other  function,  according  to  what 
is  most  appropriate  for  the  purpose  for  which  the  data  are  to  be  used. 

It  is  tempting  to  suppose  that  each  character  has  its  "natural" 
scale,  the  scale  on  which  the  biological  process  expressed  in  the 
character  works.  Thus,  growth  is  a  geometrical  rather  than  an  arith- 
metical process,  and  a  geometric  scale  would  appear  to  be  the  most 
"natural." For example, an increase of 1 gm. in a mouse weighing
20  gm.  has  not  the  same  biological  significance  as  an  increase  of  1  gm. 
in  a  mouse  weighing  2  gm.:  but  an  increase  of  10  per  cent  has  ap- 
proximately the  same  significance  in  both.  For  this  reason  a  trans- 
formation to  logarithms  would  seem  appropriate  for  measurements  of 
weight.  This,  however,  is  largely  a  subjective  judgment,  and  some 
objective  criterion  for  the  choice  of  a  scale  is  needed.  There  are 
several recognised criteria (see Wright, 1952b); but, as Wright points
out,  the  different  criteria  are  often  inconsistent  in  the  scale  they  indi- 
cate. And,  moreover,  the  same  criterion  applied  to  the  same  character 
may  indicate  different  scales  in  different  populations.   Therefore  the 
idea that every character must have its "natural" and correct scale is
largely  illusory. 

In  the  first  chapter  on  metric  characters,  Chapter  6,  it  was  stated 
that  we  should  assume  throughout  that  any  metric  character  under 
discussion  would  be  measured  on  an  "appropriate"  scale,  the 
criterion  being  that  the  distribution  of  phenotypic  values  should 
approximate  to  a  normal  curve.  This  is,  in  principle,  the  chief 
criterion,  and  a  markedly  asymmetrical,  or  skewed,  distribution  is  a 
certain  indication  that  the  data  may  have  to  be  transformed  if  they  are 
to  be  used  in  certain  ways.  But  a  transformation  may  still  be  required 
even  if  the  distribution  is  not  markedly  asymmetrical:  we  shall  see 
below  that  the  most  important  criterion  then  is  that  the  variance 
should  be  independent  of  the  mean.  We  shall  treat  the  choice  of 
scale  in  this  chapter  by  showing  what  will  arise  if  the  transformation 
required  is  not  made.  We  shall  find  that  certain  phenomena  arise, 
called  scale  effects,  which  disappear  when  the  appropriate  transforma- 
tion is  made.  For  the  sake  of  clarity  we  shall  discuss  in  particular  the 
logarithmic  transformation  which  converts  an  arithmetic  to  a  geo- 
metric scale.  This  is  probably  the  commonest  and  most  useful 
transformation.  The  general  principles,  outlined  by  reference  to  the 
log  transformation,  will,  however,  apply  equally  to  other  transforma- 
tions. Let  us  first  consider  the  distribution  of  phenotypic  values. 

Fig. 17.1 shows three distributions plotted as if from the original
data  on  an  arithmetic  scale.  They  would  all  three  be  symmetrical 
and  normal  if  the  data  were  first  transformed  to  logarithms,  or  plotted 
on  logarithmic  paper.  There  are  two  points  of  importance  to  notice. 
First,  the  degree  of  departure  from  normality  depends  on  the  amount 
of  variation  in  relation  to  the  mean.  This  may  be  seen  from  a  com- 
parison of  the  two  upper  graphs,  (a)  and  (b),  which  are  not  very 
noticeably  asymmetrical,  with  the  lower  graph,  (c),  which  is.  The 
relationship  between  the  amount  of  variation  and  the  mean,  which 
determines  the  degree  of  departure  from  normality,  is  best  expressed 
as  the  coefficient  of  variation;  i.e.  the  ratio  of  standard  deviation  to 
mean,  often  multiplied  by  100  to  bring  it  to  a  percentage.  The 
coefficient  of  variation  of  the  two  upper  graphs  is  20  per  cent,  while 
that  of  the  lower  graph  is  50  per  cent.  Thus,  a  transformation  to 
logarithms  does  not  make  an  appreciable  difference  to  the  shape  of  the 
distribution  unless  the  coefficient  of  variation  is  fairly  high — that  is, 
above  about  20  per  cent  or  so.  Consequently,  statistical  procedures 
which do not rely on a strictly normal distribution, such as the analysis
of variance, can be carried out on the untransformed data when
the  coefficient  of  variation  is  not  above  about  20  per  cent.  Trans- 
formations to  other  scales  are  also  less  necessary  when  the  coefficient 
of  variation  is  low  than  when  it  is  high. 
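
The dependence of skewness on the coefficient of variation can be checked numerically. The following is a minimal sketch in Python, assuming that the character is exactly log-normal; in that case the skewness on the arithmetic scale is C^3 + 3C, where C is the coefficient of variation expressed as a fraction. The function name is ours and is introduced only for illustration.

```python
import math


def lognormal_shape(cv):
    """Return (variance of natural logs, skewness) for a log-normal character.

    cv is the coefficient of variation on the arithmetic scale,
    given as a fraction (0.20 for 20 per cent).
    """
    var_ln = math.log(1.0 + cv ** 2)  # variance on the natural-log scale
    skew = cv ** 3 + 3.0 * cv         # skewness on the arithmetic scale
    return var_ln, skew


# The two coefficients of variation quoted for Fig. 17.1.
for cv in (0.20, 0.50):
    var_ln, skew = lognormal_shape(cv)
    print(f"CV = {cv:.2f}: log-scale variance = {var_ln:.4f}, skewness = {skew:.3f}")
```

Under this assumption the skewness rises from about 0.6 at a coefficient of variation of 20 per cent to about 1.6 at 50 per cent, which agrees with the visual impression given by the graphs.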

Fig. 17.1. Distributions that are symmetrical and normal on a
logarithmic scale shown plotted on an arithmetic scale. Explanation
in text.

The second point to notice in Fig. 17.1 is that the variance, when
computed in arithmetic units, increases when the mean increases.
This may be seen in the two upper graphs, (a) and (b). These have
both the same variance in logarithmic units, but different means.
The  mean — or  strictly  speaking  the  mode — of  (b)  is  double  that  of  (a) 
and  the  standard  deviation  in  arithmetic  units  is  correspondingly 
doubled.  Though  the  distributions  are  not  very  noticeably  skewed 
and  a  transformation  does  not  seem  to  be  very  strongly  indicated,  yet 
in  consequence  of  the  difference  of  mean  the  variances  differ  very 
greatly.  Here,  then,  is  one  of  the  commonest  scale  effects,  namely  a 
change  of  variance  following  a  change  of  the  population  mean.  The 
two graphs (a) and (b) in Fig. 17.1 might well represent two populations
which have diverged by some generations of two-way selection,
if  the  character  were  something  like  body  weight  measured  in  grams 
or  pounds.  Such  characters  are  commonly  found  to  increase  in 
variance  when  the  mean  increases  and  to  decrease  in  variance  when 
the  mean  decreases.  Fig.  17.2  shows  an  example  from  an  experiment 
with  mice  (MacArthur,  1949),  the  character  being  weight  at  60  days. 

Fig. 17.2. Distributions of body weight of male mice at 60 days.
Centre: base population before selection. Left and right: small
and large strains after 21 generations of two-way selection. (Redrawn
from MacArthur, 1949.)

                          Small    Unselected    Large
Standard deviation        1.71     2.56          5.10
Coeff. of variation, %    14.3     11.1          12.8

Phenomena  such  as  the  change  of  variance  discussed  above  are 
called  scale  effects  if  they  disappear  when  the  measurements  are 
appropriately  transformed:  in  other  words,  if  their  cause  can  be 
attributed  to  the  scale  of  measurement.  But  they  are  none  the  less 
real,  though  labelled  as  a  scale  effect  or  removed  by  transformation. 
The  large  mice,  for  example,  are  really  more  variable  than  the  small 
when  their  weights  are  measured  in  grams.  What  is  gained  by  recog- 
nising this  as  a  scale  effect  is  that  there  is  no  need  to  look  deeper  into 
the  genetic  properties  of  the  character  for  an  explanation. 

A  convenient  test  for  the  appropriateness  of  a  logarithmic  trans- 
formation is  provided  by  the  proportionality  of  standard  deviation 
and mean, which we noted in connexion with graphs (a) and (b) in
Fig. 17.1. If two distributions have the same variance on a logarithmic
scale then the coefficients of variation in arithmetic units will be
the  same.  Thus,  constancy  of  the  coefficient  of  variation  indicates 
constancy  of  variance  on  a  logarithmic  scale.  And,  if  variances  are  to 
be  compared,  we  may  simply  compare  the  coefficients  of  variation 
instead  of  expressing  the  variances  in  logarithmic  units.  The  stand- 
ard deviations  and  coefficients  of  variation  of  the  distributions  shown 
in  Fig.  17.2  are  given  in  the  legend  to  the  figure.  The  coefficients  of 
variation,  though  not  identical,  are  much  more  alike  than  the  stand- 
ard deviations,  and  this  shows  that  the  changes  of  variance  that  have 
resulted  from  the  selection  can  be  attributed,  in  large  part  at  least,  to 
the  scale  of  measurement. 
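
The comparison just described can be set out in a few lines. The sketch below, in Python, uses the standard deviations and coefficients of variation given in the legend to Fig. 17.2; the variable and function names are ours, and the means printed at the end are merely those implied by the two statistics, not values quoted from MacArthur.

```python
# Standard deviations and coefficients of variation (%) from the legend
# to Fig. 17.2, for the small, unselected and large strains.
sd = {"small": 1.71, "unselected": 2.56, "large": 5.10}
cv = {"small": 14.3, "unselected": 11.1, "large": 12.8}


def spread(values):
    """Ratio of the largest to the smallest value; 1.0 would mean constancy."""
    values = list(values)
    return max(values) / min(values)


sd_spread = spread(sd.values())
cv_spread = spread(cv.values())
print(f"largest / smallest standard deviation:       {sd_spread:.2f}")
print(f"largest / smallest coefficient of variation: {cv_spread:.2f}")

# Means implied by the two statistics (SD divided by CV as a fraction).
implied_mean = {strain: 100.0 * sd[strain] / cv[strain] for strain in sd}
for strain, mean in implied_mean.items():
    print(f"{strain}: implied mean about {mean:.1f}")
```

The standard deviations differ nearly threefold between strains, while the coefficients of variation differ by less than a third, which is the sense in which the change of variance is largely a scale effect.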

The  effect  of  scale  on  the  connexion  between  variance  and  mean 
complicates  the  comparison  of  the  variances  of  two  populations  that 
differ  also  in  mean,  as  for  example  the  comparison  of  the  variances  of 
inbreds  and  hybrids  discussed  in  Chapter  15.  If  a  difference  of 
variance  is  to  be  unambiguously  attributed  to  a  difference  of  homeo- 
static  power,  for  example,  there  must  be  independent  grounds  for 
believing  that  a  similar  difference  would  not  be  expected  as  a  scale 
effect  connected  with  the  difference  of  mean. 

Let  us  return  to  the  consequences  of  selection  and  pursue  them  a 
little  further.  If  the  variance  changes  with  the  change  of  mean  as  a 
result  of  selection,  so  also  will  the  selection  differential  and  the 
response.  The  response  per  generation  of  a  character  such  as  we  have 
been  considering  would  therefore  be  expected  to  increase  with  the 
progress  of  selection  in  the  upward  direction,  and  to  decrease  corre- 
spondingly in  the  downward  direction.  The  response  to  two-way 
selection  would  then  be  asymmetrical.  An  example  of  an  asymmetri- 
cal response  which  can  most  probably  be  attributed  to  a  scale  effect 
in  this  way  is  shown  in  Fig.  17.3.  Plotted  in  arithmetic  units,  as  in 
(a),  the  response  is  much  greater  in  the  upward  than  in  the  downward 
direction.  A  transformation  to  logarithms,  shown  in  (b),  renders  the 
response  much  more  nearly  symmetrical.  This  does  not  do  away 
with  the  fact  that  the  character  as  measured  increased  much  more  than 
it  decreased  under  selection.  But  it  accounts  for  the  asymmetry 
without  the  need  for  more  elaborate  hypotheses.  A  convenient  way  of 
eliminating  scale  effects  from  the  graphical  presentation  of  a  response 
to  selection  is  to  plot  the  response  in  the  form  of  the  realised  herit- 
ability,  as  explained  in  Chapter  11  and  illustrated  in  Fig.  11.5.   The 
realised heritability, which is the ratio of response to selection differential,
is very little influenced by scale effects (Falconer, 1954a).
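
The way in which a symmetrical response on a logarithmic scale appears asymmetrical in arithmetic units may be illustrated by a small calculation. The sketch below, in Python, is purely hypothetical: the starting mean of 100 units, the response of 0.05 log units per generation and the six generations are assumptions chosen for illustration and are not taken from any experiment.

```python
def two_way_means(start, log10_step, generations):
    """Arithmetic means of an up line and a down line whose responses
    are equal and opposite on a log10 scale."""
    up = [start * 10 ** (log10_step * t) for t in range(generations + 1)]
    down = [start * 10 ** (-log10_step * t) for t in range(generations + 1)]
    return up, down


start = 100.0  # hypothetical starting mean, arithmetic units
up, down = two_way_means(start, log10_step=0.05, generations=6)

gain = up[-1] - up[0]      # total upward response, arithmetic units
loss = down[0] - down[-1]  # total downward response, arithmetic units
print(f"upward response:   {gain:.1f}")
print(f"downward response: {loss:.1f}")
```

Although the two lines move equally far from the origin in logarithmic units, the upward line gains about twice as much as the downward line loses when the means are expressed in arithmetic units.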

Fig. 17.3. Response to two-way selection for resistance to dental
caries in rats. Resistance is measured in days and plotted on an arithmetic
scale in (a), and on a logarithmic scale in (b). The arithmetic
means were converted to logarithmic means by formula 17.1. The
coefficient of variation was high (about 50%) and was approximately
constant. The reason why the upward selection has not
covered so many generations as the downward is simply that the
increased resistance lengthened the generation interval. (Data
from Hunt, Hoppert, and Erwin, 1944.)

When means or variances are to be compared, for example in a
comparison of two populations or in following the changes resulting
from selection, and a transformation to logarithms is indicated, it is
not necessary to convert each individual measurement. On the other
hand it is not sufficient to convert the arithmetic mean or variance to
logarithms, unless the coefficient of variation is very low. The conversions
may be conveniently made by the two following formulae,
given by Wright (1952b). The first converts the mean of arithmetic
values to the mean of logarithmic values, and the second converts the
variance as computed from the arithmetic values to the variance as it
would be computed from logarithmic values.

$$\overline{\log x} = \log \bar{x} - \tfrac{1}{2} \log (1 + C^2) \tag{17.1}$$

$$\sigma^2_{(\log x)} = 0.4343 \log (1 + C^2) \tag{17.2}$$

In these formulae C is the coefficient of variation in the form
$\sigma/\bar{x}$ computed from arithmetic values, and the logarithms
are to the base 10.
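
The two conversions can be written as a short function. The following is a sketch in Python, not a definitive implementation: the function name is ours, and the constant 0.4343 of formula 17.2 is computed as the base-10 logarithm of e rather than typed in rounded form. The check at the end assumes an exactly log-normal character with hypothetical parameters, for which the formulae should recover the logarithmic mean and variance.

```python
import math

LOG10_E = math.log10(math.e)  # equals 0.4343 to four places, as in (17.2)


def log_scale_moments(mean, sd):
    """Convert an arithmetic mean and standard deviation to the mean and
    variance that would be computed from log10 values."""
    c_squared = (sd / mean) ** 2                                   # C squared
    mean_log = math.log10(mean) - 0.5 * math.log10(1 + c_squared)  # (17.1)
    var_log = LOG10_E * math.log10(1 + c_squared)                  # (17.2)
    return mean_log, var_log


# Hypothetical log-normal character: natural logs have mean ln(20) and
# standard deviation 0.3. Its arithmetic mean and SD are known exactly.
m, s = math.log(20.0), 0.3
arith_mean = math.exp(m + s ** 2 / 2)
arith_sd = arith_mean * math.sqrt(math.exp(s ** 2) - 1)

mean_log, var_log = log_scale_moments(arith_mean, arith_sd)
print(f"mean of log10 values:     {mean_log:.4f}")
print(f"variance of log10 values: {var_log:.5f}")
```

In this check the converted mean equals the logarithm of 20, whereas the logarithm of the arithmetic mean itself would be somewhat larger; the difference is the correction term in formula 17.1.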

We turn now to what is perhaps a more fundamental effect of a
scale  transformation — its  effect  on  the  apparent  nature  of  the  genetic 
variance.  To  understand  this  we  must  go  back  to  a  single  locus  and 
consider  the  effect,  or  mode  of  action,  of  the  genes.  Let  us  imagine  a 
locus with two alleles whose mode of action is geometric, the genotypic
value of A2A2 being 50 per cent greater than A1A2 and that of
A1A2 being also 50 per cent greater than A1A1. Thus on the logarithmic
scale there is no dominance, the heterozygote being exactly mid-way
between the two homozygotes. Now suppose the genotypic
values are measured in arithmetic units, such as grams, and that A1A1
has a value of 10 units. Then A1A2 will be 15 units and A2A2 22.5
units. On the arithmetic scale, therefore, A1 is partially dominant to
A2, the heterozygote no longer falling mid-way between the homozygotes.
Thus the degree of dominance is influenced by the scale of
measurement,  and  so  also  is  the  proportionate  amount  of  dominance 
variance.  This  effect  of  a  scale  transformation,  however,  is  normally 
rather  small.  A  gene  that  causes  a  50  per  cent  difference  between  the 
genotypic  values,  such  as  we  have  considered,  would  be  a  major  gene, 
easily  recognisable  individually.  But  even  so  the  degree  of  dominance 
on  the  arithmetic  scale  is  not  very  great.  Minor  genes  with  effects  of 
perhaps  1  per  cent  or  10  per  cent  would  be  scarcely  influenced  in  their 
dominance. 
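
The size of this effect can be worked out for genes of different magnitude. The sketch below, in Python, expresses the dominance on the arithmetic scale as the ratio d/a, where a is half the difference between the homozygotes and d is the deviation of the heterozygote from their mid-point; the function name is ours, and the base value of 10 units is the one used in the text.

```python
def degree_of_dominance(ratio, base=10.0):
    """d/a on the arithmetic scale for a locus at which each substitution
    of A2 for A1 multiplies the genotypic value by `ratio`."""
    a1a1 = base
    a1a2 = base * ratio
    a2a2 = base * ratio ** 2
    midpoint = (a1a1 + a2a2) / 2
    a = (a2a2 - a1a1) / 2  # half the difference between the homozygotes
    d = a1a2 - midpoint    # deviation of the heterozygote from the mid-point
    return d / a


# Effects of 50, 10 and 1 per cent per gene substitution.
for ratio in (1.5, 1.1, 1.01):
    print(f"ratio {ratio}: d/a = {degree_of_dominance(ratio):+.4f}")
```

The negative sign shows that the heterozygote lies toward A1A1. The ratio is -0.2 for the 50 per cent gene and becomes very small for genes with effects of 10 or 1 per cent, which bears out the statement that minor genes are scarcely influenced in their dominance.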

In  the  same  way  that  the  dominance  is  affected  by  the  scale,  so 
also  is  the  epistatic  interaction  between  different  loci.  Loci  with 
geometric  effects  would  combine  without  interaction  if  the  genotypic 
values  were  measured  in  logarithmic  units.  But  when  measured  in 
arithmetic  units  there  would  be  interaction  deviations  due  to  epis- 
tasis.  Thus  the  amount  of  interaction  variance  is  also  influenced  by 
the  scale  of  measurement.  The  following  example  illustrates  the 
dependence  of  interaction  on  scale. 

Example  17.1.  The  pygmy  gene  in  mice  is  a  major  gene  affecting  body 
size,  homozygotes  being  much  reduced  in  size.  The  effect  of  this  gene 
was  studied  in  different  genetic  backgrounds  (King,  1955).  The  gene  was 
transferred  from  the  strain  selected  for  small  size  where  it  arose,  to  a  strain 
selected  for  large  size,  by  repeated  backcrosses.  The  mean  difference  be- 
tween pygmy  homozygotes  and  normals  (i.e.  heterozygotes  and  normal 
homozygotes  together)  was  measured  in  the  two  strains  and  during  the 
transference,  the  comparisons  being  made  between  pygmies  and  normals 
in  the  same  litters.  The  results  are  shown  in  Fig.  17.4.  The  difference 
between  pygmies  and  normals  increases  with  the  weight  of  the  normals. 
In the background of the small strain the pygmies were about 7 gm. smaller
than  normals,  but  in  the  background  of  the  large  strain  they  were  about 
12  gm.  smaller.  Thus  the  pygmy  gene  shows  epistatic  interaction  with  the 
other  genes  that  affect  body  size.  But  if  the  effect  of  the  gene  is  expressed 
as  a  proportion,  it  is  constant  and  independent  of  the  other  genes  present. 
Pygmies are about half the weight of their normal litter-mates, no matter
what the actual weights are. Thus if the comparisons are made in logarithmic
units there is no epistatic interaction.

Fig. 17.4. Intra-litter comparisons of the 6-week weights of pygmies
and normals. Mean of pygmies plotted against mean of normals
in the same litter. (From King, 1955; reproduced by courtesy
of the author and the editor of the Journal of Genetics.)
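
The arithmetic of the example may be sketched as follows, in Python. The weights are illustrative round figures chosen only to agree with the statements in the example (pygmies about half the weight of normals, with differences of about 7 gm. and 12 gm.); they are not King's data, and the variable names are ours.

```python
import math

# Illustrative weights in grams, as (normal, pygmy) for each background.
backgrounds = {
    "small strain": (14.0, 7.0),
    "large strain": (24.0, 12.0),
}

arithmetic_diff = {}
log_diff = {}
for name, (normal, pygmy) in backgrounds.items():
    arithmetic_diff[name] = normal - pygmy
    log_diff[name] = math.log10(normal) - math.log10(pygmy)
    print(f"{name}: difference {arithmetic_diff[name]:.1f} gm., "
          f"or {log_diff[name]:.3f} log units")
```

The difference in grams depends on the background, whereas the difference in logarithmic units is the same in both, so that the interaction is present on the one scale and absent on the other.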


In  general,  therefore,  a  scale  transformation  may  remove  or 
reduce  the  variance  attributable  to  epistatic  interaction,  and  this 
variance  might  then  be  labelled  as  a  scale  effect.  A  transformation 
which  removes  or  reduces  interaction  variance  may  be  useful  if  con- 
clusions are  to  be  drawn  from  an  analysis  that  depends  for  its  validity 
on  the  absence  of  interaction.  A  detailed  treatment  of  the  relation- 
ship between  scale  and  epistatic  interaction  is  given  by  Horner, 
Comstock,  and  Robinson  (1955). 

In  this  chapter  we  have  outlined  some  of  the  scale  effects  most 
commonly  met  with,  and  have  indicated  the  circumstances  under 
which  a  transformation  of  scale  may  be  helpful  to  the  interpretation 
of  results  and  the  drawing  of  conclusions.  Transformations  of  scale, 
however,  should  not  be  made  without  good  reason.  The  first  pur- 
pose of  experimental  observations  is  the  description  of  the  genetic 
properties of the population, and a scale transformation obscures
rather  than  illuminates  the  description.  If  epistasis,  for  example,  is 
found,  this  is  an  essential  part  of  the  description,  and  it  is  better 
labelled  as  epistasis  than  as  a  scale  effect.  The  transformation  of  scale 
is  essentially  a  statistical  device  to  be  employed  for  the  purpose  of 
simplifying  the  analysis  of  the  data,  or  to  make  possible  the  drawing 
of  valid  conclusions  from  the  analysis.  It  is  sometimes  helpful  also  in 
the  interpretation  of  results.  If  epistasis,  for  example,  were  found  to 
disappear  on  transformation  to  a  logarithmic  scale  we  could  conclude 
that  the  effects  of  different  loci  combined  by  multiplication  rather 
than  by  addition.  Or,  if  there  were  good  reasons  for  attributing  a 
difference  of  variance  to  a  scale  effect  we  should  not  need  to  invoke 
more  complicated  genetic  explanations.  The  choice  of  scale,  how- 
ever, raises  troublesome  problems  in  connexion  with  the  interpreta- 
tion of  results.  Logical  justification  of  a  scale  transformation  can 
only  come  from  some  criterion  other  than  the  property  about  which 
the  conclusions  are  to  be  drawn.  If  there  is  no  independent  criterion 
the  argument  becomes  circular,  and  the  distinction  between  a  scale 
effect  and  some  other  interpretation  becomes  meaningless.  There  is 
also  a  more  fundamental  difficulty:  the  scale  appropriate  for  one 
population  may  not  be  appropriate  for  another,  and  the  scale  appro- 
priate to  the  genetic  and  environmental  components  of  the  variation 
may  be  different.  This  difficulty  is  strikingly  illustrated  by  an  analysis 
of the character "weight per locule" in a number of crosses between
varieties  of  tomato  (Powers,  1950).  By  the  same  criterion — normality 
of  the  distribution — this  character  was  found  to  require  an  arithmetic 
scale  in  some  crosses  and  a  geometric  scale  in  others;  and,  moreover, 
in  the  F2  generations  of  some  crosses  the  genetic  variation  required  one 
scale  while  the  environmental  variation  required  another. 


CHAPTER    18 

THRESHOLD   CHARACTERS 

There  are  many  characters  of  biological  interest  or  economic  im- 
portance whose  inheritance  is  multifactorial  but  whose  distribution  is 
discontinuous.  For  example:  resistance  to  disease,  a  character  ex- 
pressed either  in  survival  or  in  death  with  no  intermediate;  "litter" 
size  in  the  larger  mammals  that  bear  usually  one  young  at  a  time  but 
sometimes  two  or  three;  or  the  presence  or  absence  of  any  organ  or 
structure.  Characters  of  this  sort  appear  at  first  sight  to  be  outside  the 
realm  of  quantitative  genetics  because  they  do  not  exhibit  continuous 
variation;  yet  when  subjected  to  genetic  analysis  they  are  found  to  be 
under  the  influence  of  many  genes  just  as  any  metric  character.  For 
this  reason  they  have  been  called  "quasi-continuous  variations" 
(Grüneberg, 1952): the phenotypic values are discontinuous but the
mode  of  inheritance  is  like  that  of  a  continuously  varying  character. 

The  clue  to  the  understanding  of  the  inheritance  of  such  characters 
lies  in  the  idea  that  the  character  has  an  underlying  continuity  with  a 
"threshold"  which  imposes  a  discontinuity  on  the  visible  expression 
of  the  character,  as  depicted  in  Fig.  18.1.  The  underlying  continuous 
variation  is  both  genetic  and  environmental  in  origin,  and  may  be 
thought  of  as  the  concentration  of  some  substance  or  the  speed  of  some 
developmental  process — of  something,  that  is  to  say,  that  could  in 
principle  be  measured  and  studied  as  a  metric  character  in  the 
ordinary  way.  The  hypothetical  measurement  of  this  variation  is 
supposed  to  be  made  on  a  scale  that  renders  its  distribution  normal, 
and  the  unit  of  measurement  is  the  standard  deviation  of  the  dis- 
tribution. This  provides  what  may  be  called  the  underlying  scale.  We 
now  have  two  scales  for  the  description  of  the  phenotypic  values:  the 
underlying  scale  which  is  continuous,  and  the  visible  scale  which  is 
discontinuous.  The  two  are  connected  by  the  threshold,  or  point  of 
discontinuity.  This  is  a  point  on  the  continuous  scale  which  corre- 
sponds with  the  discontinuity  in  the  visible  scale.  The  idea  will  be 
clearer  from  an  inspection  of  Fig.  18.1,  which  depicts  a  character 
whose  visible  expression  can  take  only  two  forms,  such  as  alive  versus 
dead, or present versus absent. Individuals whose phenotypic values
on  the  underlying  scale  exceed  the  threshold  will  appear  in  one  visible 
class,  while  individuals  below  the  threshold  will  appear  in  the  other. 


Fig. 18.1. Illustrations of a threshold character with two visible
classes. The vertical line marks the threshold between the two
phenotypic classes, one of which is cross-hatched. The population
depicted on the left has an incidence of 10%; that on the right, an
incidence of 90%. The horizontal axes are marked in standard
deviations.

On the visible scale individuals can have only two values, 0 or 1.
Groups  of  individuals,  however,  such  as  families  or  the  population  as 
a  whole  can  have  any  value,  in  the  form  of  the  proportion  or  percent- 
age of  individuals  in  one  or  other  class.  This  may  be  referred  to  as 
the  incidence  of  the  character.  Susceptibility  to  disease,  for  example, 
can  be  expressed  as  the  percentage  mortality  in  the  population  or  in 
a  family.  The  incidence  is  quite  adequate  as  a  description  of  the 
population  or  group,  but  the  percentage  scale  in  which  the  incidence 
is  expressed  is  inappropriate  for  some  purposes  because  on  a  per- 
centage scale  variances  differ  according  to  the  mean.  The  interpre- 
tation of  genetic  analyses  of  threshold  characters  is  therefore  facili- 
tated by  the  transformation  of  incidences  to  values  on  the  underlying 
scale.  The  transformation  is  easily  made  by  reference  to  a  table  of 
probabilities  of  the  normal  curve.  The  threshold  is  a  point  of  trun- 
cation whose  deviation  from  the  population  mean  can  be  found  from 
the proportion of the population falling beyond it. A table of "probits"
(Fisher and Yates, 1943, Table ix) is convenient to use because
it  refers  to  a  single  tail  of  the  distribution  and  obviates  confusion 
over  the  sign  of  the  deviation.  The  transformation  from  the  visible 
to  the  underlying  scale  enables  us  to  state  the  mean  phenotypic  value 
of  a  population  or  family  in  terms  of  its  standard  deviation,  and  to 
compare the means of different populations or families provided they
have  the  same  standard  deviation.  It  is  convenient  to  take  the  posi- 
tion of  the  threshold  as  the  origin,  or  zero-point,  on  the  underlying 
scale  and  to  express  the  mean  as  a  deviation  from  the  threshold. 
Thus  if  the  incidence  of  the  character  is,  for  example,  10  per  cent,  a 
table  of  the  normal  curve  shows  that  the  threshold  exceeds  the  mean 
by 1.28 standard deviations. The population mean, referred to the
threshold as origin, is therefore -1.28σ. Or, if the incidence were
90 per cent then the population mean would be +1.28σ, as shown in
Fig. 18.1. For any comparison of means, however, it is necessary to
assume  that  the  populations  compared  have  the  same  variance  on  the 
underlying  scale.  If  reasons  are  known  for  the  variances  not  being 
equal (in comparisons, for example, between inbreds, F1's and F2's)
then  the  means  cannot  be  expressed  on  a  common  scale  that  allows  a 
valid  comparison  to  be  made. 
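
The transformation from incidence to the underlying scale may be sketched in Python, using the inverse of the normal distribution function from the standard library in place of a printed table. The function name is ours; the incidence is taken as the proportion of individuals exceeding the threshold, and the result is the population mean in standard deviations with the threshold as origin.

```python
from statistics import NormalDist


def mean_from_incidence(incidence):
    """Population mean on the underlying scale, in standard deviations,
    measured from the threshold as origin.

    incidence is the proportion above the threshold, strictly between
    0 and 1 (inv_cdf raises an error at 0 or 1).
    """
    # Deviation of the threshold from the mean: the point beyond which
    # the stated proportion of the population falls.
    threshold_deviation = NormalDist().inv_cdf(1.0 - incidence)
    return -threshold_deviation


for incidence in (0.10, 0.90):
    print(f"incidence {incidence:.0%}: mean = {mean_from_incidence(incidence):+.2f} SD")
```

Incidences of 10 and 90 per cent give means of about -1.28 and +1.28 standard deviations, as in Fig. 18.1. The comparison of such means between populations still rests on the assumption of equal variance on the underlying scale.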

This  is  as  far  as  we  can  go  with  a  character  that  is  visibly  expressed 
in  only  two  classes.  The  mean  of  a  population  or  group  can  be  stated, 
but  not  the  variance,  because  the  mean  has  to  be  stated  in  terms  of  the 
standard  deviation.  We  can,  however,  subject  the  observed  means  of 
families  to  analysis  and  compute  the  heritability  of  the  character. 
The  heritability  of  threshold  characters  is  treated  by  A.  Robertson 
and  Lerner  (1949)  and  by  Dempster  and  Lerner  (1950),  and  will  not 
be  further  discussed  here. 

If  a  character  has  three  classes  in  its  visible  scale  then  comparisons 
can  be  made  between  the  variances  of  populations  as  well  as  between 
the  means.  The  number  of  lumbar  vertebrae  in  mice  is  a  character 
of  this  sort  that  has  been  extensively  studied  (Green,  1951;  McLaren 
and  Michie,  1955).  The  number  is  usually  either  5  or  6,  but  some 
individuals  have  5  on  one  side  and  6  on  the  other.  This  comes  about 
through  the  last  vertebra  being  sacralised  on  one  side  and  not  on  the 
other. The asymmetrical mice have 5½ lumbar vertebrae and are
regarded as being intermediate between the 5-class and the 6-class.

When  the  visible  scale  has  three  classes  there  are  two  thresholds, 
as  shown  in  Fig.  18.2.  If  the  assumption  is  made  that  the  difference 
between  the  two  thresholds  represents  a  constant  difference  on  the 
underlying  scale,  then  we  have  not  only  a  fixed  origin  of  the  scale  but 
also  a  fixed  unit,  and  this  provides  a  basis  for  the  comparison  of 
variances  as  well  as  of  means.  The  underlying  scale  then  has  one  of 
the  thresholds  as  origin  and  the  threshold  difference  as  the  unit  of 
measurement.    The  idea  is  most  easily  explained  by  a  numerical 
example. Consider the two populations illustrated in Fig. 18.2. Let
their standard deviations on a common underlying scale be σ₁ and σ₂
respectively, and let them have the following incidences in the three
visible classes, X, I, and Z, of which I is the intermediate class:

Incidence, %
                              Class
                      X       I       Z
Population (1)        60      15      25
Population (2)        20      10      70

The deviations of the thresholds from the population means, found
from a table of the normal curve, are as follows:

                      X/I         I/Z         Threshold interval
Population (1)        +0.25σ₁     +0.67σ₁     0.42σ₁
Population (2)        -0.84σ₂     -0.52σ₂     0.32σ₂

Fig. 18.2. Illustrations of a threshold character with three visible
classes, in two populations with incidences as shown. The axes are
marked in threshold units, and the population means are indicated
by arrows. Further explanation in text.

The  intervals  between  the  two  thresholds,  given  above  on  the  right, 
are  found  by  subtraction  of  the  deviations  of  the  two  thresholds  in 
each  population.  These  threshold  intervals  are  supposed  by  hypo- 
thesis to  be  equal  on  the  common  underlying  scale.  By  assigning  the 
threshold interval the value of one "threshold unit" we can therefore
express  the  standard  deviations  of  the  two  populations  on  a  common 
basis  in  terms  of  threshold  units.  The  standard  deviations  then 
become 

σ₁ = 2.38 threshold units
σ₂ = 3.12 threshold units.

The means of the populations can also be expressed in threshold
units. Reckoned from the X/I threshold as origin they are

M1 = -0.25σ₁ = -0.60 threshold units
M2 = +0.84σ₂ = +2.62 threshold units.

The standard deviation and population mean of a character with
three visible classes may be put in general form in the following way.
Let X be the incidence in one visible class, and Y the incidence in
this class together with the intermediate class. Let the threshold
between these two classes be the origin of the underlying scale. Let
x and y be the deviations of the two thresholds from the population
mean, in standard deviations, corresponding to the incidences X and Y
respectively, so that y - x is the threshold interval. Then the
standard deviation is

$$\sigma = \frac{1}{y - x} \text{ threshold units} \tag{18.1}$$

and the mean is

$$M = -x\sigma = \frac{-x}{y - x} \text{ threshold units} \tag{18.2}$$
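Equations 18.1 and 18.2 are easily evaluated numerically. The following is a minimal sketch in Python, not part of the original treatment: the function name `threshold_scale` is our own, the underlying scale is assumed to be normal, and the deviations x and y are reckoned from the population mean. Applied to the two populations above it should agree with the values in the text to within the rounding of the tabulated deviations.

```python
from statistics import NormalDist


def threshold_scale(first_class, intermediate_class):
    """Return (mean, sd) in threshold units for a three-class character.

    Both arguments are incidences given as proportions. The origin of
    the scale is the threshold between the first class and the
    intermediate class (an illustrative helper, not from the text).
    """
    std_normal = NormalDist()
    # Deviations of the two thresholds from the population mean, in sd.
    x = std_normal.inv_cdf(first_class)
    y = std_normal.inv_cdf(first_class + intermediate_class)
    sd = 1.0 / (y - x)   # the threshold interval is one unit (eq. 18.1)
    mean = -x * sd       # mean reckoned from the first threshold (eq. 18.2)
    return mean, sd


for label, first, middle in [("Population (1)", 0.60, 0.15),
                             ("Population (2)", 0.20, 0.10)]:
    mean, sd = threshold_scale(first, middle)
    print(f"{label}: M = {mean:+.2f}, sd = {sd:.2f} threshold units")
```

The small differences from 2.38 and 3.12 arise because the text rounds the threshold deviations to two decimal places before taking the reciprocal.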


The  comparison  of  variances  in  this  way  depends  entirely,  as  we 
have  pointed  out,  on  the  assumption  that  the  interval  between  the 
two  thresholds  is  constant  from  one  population  to  another.  If  we 
think  again  of  the  hypothetical  substance  or  process  whose  concentra- 
tion or  rate  determines  the  value  on  the  underlying  scale,  the  assump- 
tion is  that  the  intermediate  class  spans  the  same  difference  of  con- 
centration or  of  rate  in  the  two  populations  compared.  Whether  this 
assumption  is  a  reasonable  one  or  not  is  hard  to  judge.  It  may, 
nevertheless,  lead  to  reasonable  results,  as  the  following  example 
shows. 

Example 18.1. The number of lumbar vertebrae was studied in two
inbred lines of mice and their cross (Green and Russell, 1951). The inbred
lines were a branch of the C3H strain with predominantly 5 lumbar
vertebrae, and the C57BL strain with predominantly 6 lumbar vertebrae.
Crosses were made reciprocally, and F2 generations were made from each
F1. The incidences of the 5-vertebra class and of the intermediate class of
asymmetrical mice with 5½ are given in the table. The reciprocal F1's
were found to differ and are listed separately. The F2's did not differ and
their results are pooled. The table gives also the positions of the two
thresholds in standard deviations; and the mean and standard deviation
computed in threshold units, the mean being reckoned from the threshold
between the 5-class and the asymmetrical class as origin.

                         Incidence, %      Deviation of         Mean and standard
                                           thresholds from      deviation in
                                           mean, in σ           threshold units
Population               5       5½        5/5½      5½/6       M         σ

Inbreds  C3H             96.9    2.3       +1.87     +2.41      -3.44     1.84
         C57              1.3    2.0       -2.23     -1.84      +5.74     2.58

F1       C3H♀ × C57♂     57.4   15.5       +0.19     +0.61      -0.44     2.36
         C57♀ × C3H♂     29.0   25.0       -0.55     +0.10      +0.85     1.53

F2 (pooled)              46.7   12.2       -0.08     +0.23      +0.27     3.25

The distributions of the populations, based on the computed means and
standard deviations, are shown graphically in Fig. 18.3. It should be noted
that the means and standard deviations of the inbreds are not very precisely
estimated because the incidences are low. The computed properties of the
populations follow the expected pattern. The F1 generation is intermediate
in mean between the two parental populations, though there is a maternal
effect causing a difference between the reciprocal F1's. This maternal
effect has been further studied and confirmed by McLaren and Michie
(1956a). The variance of the F1 is somewhat lower than that of the
parental inbreds, as might be expected from a reduction of environmental
variance in the hybrids. This was further studied and confirmed by
McLaren and Michie (1955). The F2 is equal in mean to the F1, but shows
an increased variance as would be expected from the segregation of genes.
If we take 2.00 as the mean standard deviation of the F1, representing
purely environmental variation, then the environmental variance is 4.00,
and the total phenotypic variance given by the F2 is 10.56; therefore the
genotypic variance works out at 6.56, or 62 per cent of the total. Thus the
analysis of the threshold character studied in this cross leads to very
reasonable results, and the assumptions on which it rests do not seem to be
very seriously wrong.
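The partition of variance at the end of this example is simple arithmetic, and may be set out as a short Python sketch. The function name `genotypic_fraction` is our own, and the only assumption carried over from the example is that the F1 standard deviation (taken as 2.00) represents purely environmental variation.

```python
def genotypic_fraction(sd_environmental, sd_total):
    """Return (genotypic variance, its fraction of the total variance).

    Both standard deviations must be on the same scale, here threshold
    units. Illustrative helper, not from the original text.
    """
    var_environmental = sd_environmental ** 2
    var_total = sd_total ** 2
    var_genotypic = var_total - var_environmental
    return var_genotypic, var_genotypic / var_total


# F1 standard deviation taken as 2.00, F2 standard deviation 3.25.
var_g, fraction = genotypic_fraction(2.00, 3.25)
print(f"genotypic variance = {var_g:.2f} ({100 * fraction:.0f} per cent of total)")
```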


The meaning of the threshold unit in which values on the underlying
scale are expressed may conveniently be discussed by reference
to the number of lumbar vertebrae in mice, described in the above
example. From the graduation of the scale at the foot of Fig. 18.3
it appears that the threshold interval corresponds to one vertebra. It
is therefore tempting to regard the scale as indicating "potential"
vertebrae, ranging from 5 at the origin to 15 at the upper extreme
and to -5 at the lower extreme. We should then regard the developing
vertebral column as being protected by canalisation against this
wide range of potential variation, so that the vertebrae actually
formed are restricted to the narrow range between 5 and 6. This
interpretation, however, assumes that individuals with a potential
number anywhere between 5 and 6 will be asymmetrical with 5½
vertebrae; and for this there is no justification. The asymmetrical
individuals may equally well, or more probably, be those with almost
exactly 5½ potential vertebrae. Suppose, for example, that the range
of potential vertebrae that gave rise to an asymmetrical individual
were between 5.4 and 5.6. Then 1 threshold unit would correspond
to 0.2 potential vertebrae; the origin of the underlying scale would
be at 5.4 and the variation would range from 7.4 potential vertebrae
at one extreme to 3.4 at the other. Or, if the asymmetrical individuals
covered a range of only 0.1 potential vertebrae, the whole distribution
would lie within the potential numbers of 5 and 6, just as the
actual range does. Thus the threshold unit is purely arbitrary in
nature; though useful for the comparison of populations, it cannot be
given any concrete interpretation.

Fig. 18.3. Distributions of number of lumbar vertebrae in mice
transformed to the underlying scale of threshold units. The upper
distributions are two inbred lines, the two middle ones are the two
reciprocal F1's, and the lower distribution is the F2. (Data from
Green & Russell, 1951.) See example 18.1 for further explanation.

From  what  has  been  said  so  far  in  this  chapter  it  will  be  clear  that 
threshold  characters  do  not  provide  ideal  material  for  the  study  of 
quantitative  genetics,  because  the  genetic  analyses  to  which  they  can 
be  subjected  are  limited  in  scope  and  subject  to  assumptions  that  one 
would  be  unwilling  to  make  except  under  the  force  of  necessity.  We 
turn  now  to  a  consideration  of  some  aspects  of  selection  for  threshold 
characters,  which  has  more  practical  importance  than  the  genetic 
analyses  that  we  have  been  considering,  and  does  not  involve  the  same 
theoretical  difficulties. 


Selection  for  Threshold  Characters 

Selection  for  threshold  characters  has  some  practical  importance 
in  connexion  with  the  improvement  of  viability  and  with  changing 
the  response  of  experimental  animals  to  treatments,  such  as,  for 
example,  increasing  or  decreasing  drug  resistance.  We  shall  consider 
only  characters  with  two  visible  classes;  and  we  shall  assume  that 
there  is  no  means  of  measuring  some  aspect  of  the  character  that 
varies  continuously,  such  as  measuring  the  time  of  survival  instead  of 
classifying simply dead versus alive.

The response to selection depends in the usual way on the selection
differential. But the selection differential does not depend primarily
on the proportion selected, as with a continuously varying
character, but on the incidence, for the following reason. We may
breed exclusively from those individuals in the desired phenotypic
class, but we cannot discriminate between those with high and those
with low values on the underlying scale. The selected individuals are
therefore a random sample from the desired class, and the mean of
the selected individuals is the mean of the desired class, irrespective
of whether we select all of the desired class or only a portion of it.
The point will be made clearer by reference to Fig. 18.1, letting the
cross-hatching represent the desired class. Let us suppose that the
replacement rate allows us to select 10 per cent of the population. If
we select out of the population on the right, with an incidence of 90
per cent, the mean of the selected individuals will be the same as if
we had selected 90 per cent. But if we select out of the population
on the left, with an incidence of 10 per cent, we shall use all of the
individuals in the desired class and none of the others. The selection
differential will then be the same as if we had selected on the basis of
a continuously varying character. Thus the selection differential is
greatest when the incidence is exactly equal to the proportion selected.
If it is less we shall be forced to use some individuals of the undesired
class; and if it is greater we shall do no better than we should
by selecting the whole of the desired class.
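The dependence of the selection differential on the incidence can be illustrated numerically. The sketch below is our own and rests on two assumptions that should be stated plainly: the underlying scale is normal with unit standard deviation, and when the desired class is too small the remaining parents are taken at random from the undesired class. The function name `selection_differential` is hypothetical.

```python
from statistics import NormalDist


def selection_differential(incidence, proportion_selected):
    """Mean of the selected parents on the underlying scale, in sd units.

    The population mean is zero and the desired class lies above the
    threshold. Illustrative sketch under the assumptions in the lead-in.
    """
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1.0 - incidence)
    ordinate = std_normal.pdf(threshold)
    mean_desired = ordinate / incidence            # mean of the desired class
    if proportion_selected <= incidence:
        # Parents are a random sample of the desired class.
        return mean_desired
    # All of the desired class, plus a random sample of the rest.
    mean_rest = -ordinate / (1.0 - incidence)
    extra = proportion_selected - incidence
    return (incidence * mean_desired + extra * mean_rest) / proportion_selected


# Selecting 10 per cent as parents at three different incidences.
for incidence in (0.05, 0.10, 0.90):
    s = selection_differential(incidence, 0.10)
    print(f"incidence {incidence:.2f}: selection differential = {s:.2f} sd")
```

On these assumptions the differential is largest when the incidence equals the proportion selected, and falls away on either side, as argued in the text.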

With  some  characters,  however,  the  incidence  can  be  altered  and 
this  provides  a  means  of  improving  the  response  to  selection.  If  the 
character  is,  for  example,  a  reaction  to  some  treatment,  the  treatment 
can  be  increased  or  reduced  in  intensity,  so  that  the  incidence  is 
altered.  This  is  an  alteration  of  the  mean  level  of  the  environment, 
and  its  effect  is  in  principle  to  shift  the  distribution  of  phenotypic 
values  with  respect  to  the  fixed  threshold.  But  it  is  more  con- 
venient to  regard  it  as  changing  the  nature  of  the  character  and  shift- 
ing the  threshold  with  respect  to  a  fixed  mean  phenotypic  level. 
When  the  level  of  the  threshold  can  be  controlled  in  this  way,  the 
maximum  speed  of  progress  under  selection  will  be  attained  by  ad- 
justing the  threshold  so  that  the  incidence  is  kept  as  nearly  as  possible 
equal  to  the  minimum  proportion  that  must  be  selected  for  breeding. 
The  progress  made  can  be  assessed  by  subjecting  the  population,  or 
part  of  it,  to  the  original  treatment  under  which  the  threshold  is  at  its 
original level.

Genetic assimilation. A very interesting result of the application
of this principle of changing the threshold by environmental
means is the phenomenon known as "genetic assimilation"
(Waddington, 1953). If a threshold character appears as a result of an
environmental stimulus, and selection is applied for this character, it
may eventually be made to appear spontaneously, without the necessity
of the environmental stimulus. In this way what was originally
an "acquired character" becomes by perfectly orthodox principles of
selection an "inherited character" (Waddington, 1942). In such a
situation there are two thresholds, one spontaneous and the other
induced, as shown in Fig. 18.4. The spontaneous threshold is at first
outside the range of variation of the population, so that there is no
variation of phenotype and no selection can be applied (Fig. 18.4, a).
The induced threshold, however, is within the range of the underlying
scale covered by the population, and it allows individuals toward
one end of the distribution to be picked out by selection. In this way
the mean genotypic value of the population is changed. If this change
goes far enough some individuals will eventually cross the spontaneous
threshold and appear as spontaneous variants (Fig. 18.4, b).
When the spontaneous incidence becomes high enough selection may
be continued without the aid of the environmental stimulus, and the
spontaneous incidence may be further increased (Fig. 18.4, c).

Fig. 18.4. Diagram illustrating genetic assimilation of a threshold
character. Distributions on the underlying scale, which is marked
in standard deviations. The vertical lines show the positions of the
induced and spontaneous thresholds, and the arrows mark the
population means at three stages of selection.

(a) before selection:        incidence: induced =  30 %, spontaneous =  0 %
(b) after some selection:    incidence: induced =  80 %, spontaneous =  2 %
(c) after further selection: incidence: induced = 100 %, spontaneous = 95 %

Example  18.2.  An  experimental  demonstration  of  genetic  assimilation 
in  Drosophila  melanogaster  is  described  by  Waddington  (1953).  The  charac- 
ter was  the  absence  of  the  posterior  cross-vein  of  the  wing.  In  the  base 
population  no  flies  with  this  abnormality  were  present,  but  treatment  of 
the  puparium  by  heat  shock  caused  about  30  per  cent  of  cross-veinless 
individuals  to  appear.  Selection  in  both  directions  was  applied  to  the 
treated  flies,  and  after  14  generations  the  incidence  of  the  induced  character 
had  risen  to  80  per  cent  and  fallen  to  8  per  cent.  At  this  time  cross-veinless 
flies  began  to  appear  in  small  numbers  among  untreated  flies  of  the  upward- 
selected  line,  and  by  generation  16  the  spontaneous  incidence  was  between 
1  and  2  per  cent.  Selection  was  then  continued  without  treatment,  the 
population  being  subdivided  into  a  number  of  lines.  The  best  four  of  the 
lines,  selected  without  further  treatment,  reached  spontaneous  incidences 
ranging  from  67  per  cent  to  95  per  cent.  The  distributions  in  Fig.  18.4 
illustrate  the  progress  of  the  upward  selection.  Graph  (b)  shows  a  spon- 
taneous incidence  of  2  per  cent  and  an  induced  incidence  of  80  per  cent 
and thus corresponds approximately with generation 16. On the assumption
of constant variance, the change of mean at this stage amounted to
1.36 standard deviations. Graph (c) shows a spontaneous incidence of
95 per cent and represents the line that finally showed the greatest
progress. Its mean on the underlying scale is 5.15 standard deviations above
that of the initial population.
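The changes of mean quoted in this example follow from the incidences by the same use of the normal curve as before. The sketch below is our own illustration, assuming a normal underlying scale of constant variance and taking the incidences as the round figures given for Fig. 18.4; the helper name `threshold_from_mean` is hypothetical. With these round figures the mean at stage (c) comes out close to, though not exactly at, the value quoted, since the result is sensitive to the exact position assigned to the spontaneous threshold.

```python
from statistics import NormalDist


def threshold_from_mean(incidence):
    """Distance of a threshold above the population mean, in sd units,
    when the given proportion of the population lies beyond it."""
    return NormalDist().inv_cdf(1.0 - incidence)


# Induced threshold: 30 % incidence before selection, 80 % at stage (b).
# The threshold is fixed, so the change in its distance from the mean
# is the amount by which the mean has moved.
shift_b = threshold_from_mean(0.30) - threshold_from_mean(0.80)

# Spontaneous threshold, located from the 2 % incidence at stage (b)
# and expressed as a distance from the initial mean.
spontaneous_threshold = shift_b + threshold_from_mean(0.02)

# Stage (c): 95 % spontaneous incidence, so the mean lies above it.
mean_c = spontaneous_threshold - threshold_from_mean(0.95)

print(f"change of mean at stage (b): {shift_b:.2f} sd")
print(f"mean at stage (c): about {mean_c:.1f} sd above the initial mean")
```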

The  idea  of  genetic  assimilation  is  not  confined  to  threshold 
characters;  but  for  its  wider  significance  the  reader  must  be  referred 
to  Waddington  (1957). 




CHAPTER    19 

CORRELATED   CHARACTERS 

This  chapter  deals  with  the  relationships  between  two  metric  charac- 
ters, in  particular  with  characters  whose  values  are  correlated — 
either  positively  or  negatively — in  the  individuals  of  a  population. 
Correlated  characters  are  of  interest  for  three  chief  reasons.  Firstly 
in  connexion  with  the  genetic  causes  of  correlation  through  the 
pleiotropic  action  of  genes:  pleiotropy  is  a  common  property  of  major 
genes,  but  we  have  as  yet  had  little  occasion  to  consider  its  effects  in 
quantitative  genetics.  Secondly  in  connexion  with  the  changes 
brought  about  by  selection:  it  is  important  to  know  how  the  im- 
provement of  one  character  will  cause  simultaneous  changes  in  other 
characters.  And  thirdly  in  connexion  with  natural  selection:  the 
relationship  between  a  metric  character  and  fitness  is  the  primary 
agent  that  determines  the  genetic  properties  of  that  character  in  a 
natural  population.  This  last  point,  however,  will  be  discussed  in 
the  next  chapter. 


Genetic  and  Environmental  Correlations 

In  genetic  studies  it  is  necessary  to  distinguish  two  causes  of  cor- 
relation between  characters,  genetic  and  environmental.  The  genetic 
cause  of  correlation  is  chiefly  pleiotropy,  though  linkage  is  a  cause  of 
transient  correlation  particularly  in  populations  derived  from  crosses 
between  divergent  strains.  Pleiotropy  is  simply  the  property  of  a 
gene  whereby  it  affects  two  or  more  characters,  so  that  if  the  gene  is 
segregating  it  causes  simultaneous  variation  in  the  characters  it 
affects.  For  example,  genes  that  increase  growth  rate  increase  both 
stature  and  weight,  so  that  they  tend  to  cause  correlation  between 
these  two  characters.  Genes  that  increase  fatness,  however,  influence 
weight  without  affecting  stature,  and  are  therefore  not  a  cause  of 
correlation. The degree of correlation arising from pleiotropy
expresses the extent to which two characters are influenced by the same
genes. But the correlation resulting from pleiotropy is the overall, or
net,  effect  of  all  the  segregating  genes  that  affect  both  characters. 
Some  genes  may  increase  both  characters,  while  others  increase  one 
and  reduce  the  other;  the  former  tend  to  cause  a  positive  correlation, 
the  latter  a  negative  one.  So  pleiotropy  does  not  necessarily  cause  a 
detectable  correlation.  The  environment  is  a  cause  of  correlation  in 
so  far  as  two  characters  are  influenced  by  the  same  differences  of 
environmental  conditions.  Again,  the  correlation  resulting  from  en- 
vironmental causes  is  the  overall  effect  of  all  the  environmental 
factors  that  vary;  some  may  tend  to  cause  a  positive  correlation,  others 
a  negative  one. 

The  association  between  two  characters  that  can  be  directly 
observed  is  the  correlation  of  phenotypic  values,  or  the  phenotypic 
correlation.  This  is  determined  from  measurements  of  the  two 
characters  in  a  number  of  individuals  of  the  population.  Suppose, 
however,  that  we  knew  not  only  the  phenotypic  values  of  the  indi- 
viduals measured,  but  also  their  genotypic  values  and  their  environ- 
mental deviations  for  both  characters.  We  could  then  compute  the 
correlation  between  the  genotypic  values  of  the  two  characters  and 
the  correlation  between  the  environmental  deviations,  and  so  assess 
independently  the  genetic  and  environmental  causes  of  correlation. 
And  if,  in  addition,  we  knew  the  breeding  values  of  the  individuals,  we 
could  determine  also  the  correlation  of  breeding  values.  In  principle 
there  are  also  correlations  between  dominance  deviations,  and  be- 
tween the  various  interaction  deviations.  To  deal  with  all  these  cor- 
relations, even  in  theory,  would  be  unmanageably  complex,  and 
fortunately  is  not  necessary,  since  the  practical  problems  can  be  quite 
adequately  dealt  with  in  terms  of  two  correlations.  These  are  the 
genetic  correlation,  which  is  the  correlation  of  breeding  values,  and 
the  environmental  correlation,  which  is  not  strictly  speaking  the  cor- 
relation of  environmental  deviations,  but  the  correlation  of  environ- 
mental deviations  together  with  non-additive  genetic  deviations.  In 
other  words,  just  as  the  partitioning  of  the  variance  of  one  charac- 
ter into  the  two  components,  additive  genetic  versus  all  the  rest, 
was  adequate  for  many  purposes,  so  now  the  covariance  of  two 
characters need only be partitioned into these same two components.
The "genetic" and "environmental" correlations thus correspond
to the partitioning of the covariance into the additive genetic
component versus all the rest. The methods of estimating these
two correlations will be explained later. Let us consider first how
they combine together to give the directly observable phenotypic
correlation.

The  following  symbols  will  be  used  throughout  this  chapter: 

X and Y:                the two characters under consideration.

$r_P$                   the phenotypic correlation between the two
                        characters, X and Y.
$r_A$                   the genetic correlation between X and Y (i.e. the
                        correlation of breeding values).
$r_E$                   the environmental correlation between X and Y
                        (including non-additive genetic effects).
cov                     the covariance of the two characters X and Y, with
                        subscripts P, A, or E, having the same meaning as
                        for the correlations.
$\sigma^2$ and $\sigma$ variance and standard deviation, with subscripts
                        P, A, or E, as above, and X or Y according to the
                        character referred to. E.g. $\sigma^2_{PX}$ =
                        phenotypic variance of character X.
$h^2$                   the heritability, with subscript X or Y, according
                        to the character.
$e^2$                   $= 1 - h^2$.

(The  customary  symbol  for  the  genetic  correlation  is  rG,  but  since  the 
genetic  correlation  is  almost  always  the  correlation  of  breeding  values 
we  shall  use  the  symbol  rA  for  the  sake  of  consistency  with  previous 
chapters.) 

A  correlation,  whatever  its  nature,  is  the  ratio  of  the  appropriate 
covariance  to  the  product  of  the  two  standard  deviations.  For 
example,  the  phenotypic  correlation  is 

$$r_P = \frac{\mathrm{cov}_P}{\sigma_{PX}\,\sigma_{PY}}$$

The phenotypic covariance is the sum of the genetic and environmental
covariances, so we can write the phenotypic correlation as

$$r_P = \frac{\mathrm{cov}_A + \mathrm{cov}_E}{\sigma_{PX}\,\sigma_{PY}}$$

The denominator can be differently expressed by the following
device: $\sigma_A^2 = h^2\sigma_P^2$, and $\sigma_E^2 = e^2\sigma_P^2$.
So $\sigma_P = \sigma_A/h = \sigma_E/e$. The phenotypic
correlation then becomes

$$r_P = h_X h_Y \frac{\mathrm{cov}_A}{\sigma_{AX}\,\sigma_{AY}} + e_X e_Y \frac{\mathrm{cov}_E}{\sigma_{EX}\,\sigma_{EY}}$$

Therefore

$$r_P = h_X h_Y r_A + e_X e_Y r_E \tag{19.1}$$


This  shows  how  the  genetic  and  environmental  causes  of  correlation 
combine  together  to  give  the  phenotypic  correlation.  If  both 
characters  have  low  heritabilities  then  the  phenotypic  correlation  is 
determined  chiefly  by  the  environmental  correlation:  if  they  have 
high  heritabilities  then  the  genetic  correlation  is  the  more  important. 
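The way in which the heritabilities weight the two correlations may be seen by working equation 19.1 through with a few figures. The following Python sketch is our own; the function names and the numerical values are purely illustrative and are not taken from any of the studies cited. The second function simply rearranges the same equation to give the environmental correlation when the phenotypic and genetic correlations are known.

```python
from math import sqrt


def phenotypic_correlation(h2_x, h2_y, r_a, r_e):
    """Equation 19.1: r_P = h_X h_Y r_A + e_X e_Y r_E, with e^2 = 1 - h^2."""
    h_x, h_y = sqrt(h2_x), sqrt(h2_y)
    e_x, e_y = sqrt(1.0 - h2_x), sqrt(1.0 - h2_y)
    return h_x * h_y * r_a + e_x * e_y * r_e


def environmental_correlation(h2_x, h2_y, r_p, r_a):
    """Equation 19.1 rearranged to give r_E from r_P, r_A and the heritabilities."""
    h_x, h_y = sqrt(h2_x), sqrt(h2_y)
    e_x, e_y = sqrt(1.0 - h2_x), sqrt(1.0 - h2_y)
    return (r_p - h_x * h_y * r_a) / (e_x * e_y)


# Hypothetical values: the same pair of correlations combined under a
# low and under a high heritability of both characters.
for h2 in (0.2, 0.8):
    r_p = phenotypic_correlation(h2, h2, r_a=0.8, r_e=-0.1)
    print(f"h2 = {h2}: r_P = {r_p:.2f}")
```

With low heritabilities the phenotypic correlation stays near the environmental one; with high heritabilities it approaches the genetic one.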

The  genetic  and  environmental  correlations  are  often  very  differ- 
ent in  magnitude  and  sometimes  different  even  in  sign,  as  may  be 
seen  from  the  examples  given  in  Table  19.1.  A  difference  in  sign 
between  the  two  correlations  shows  that  genetic  and  environmental 
sources  of  variation  affect  the  characters  through  different  physio- 
logical mechanisms.  The  correlations  between  body-weight  and  egg- 
laying  characters  in  poultry  provide  striking  examples.  Pullets  that 
are  larger  at  18  weeks  from  genetic  causes  reach  sexual  maturity  later 
and  lay  fewer  eggs,  but  the  eggs  are  larger.  Pullets  that  are  larger 
from  environmental  causes  reach  sexual  maturity  earlier  and  lay 
more  eggs,  which  however  are  very  little  different  in  size. 

The  dual  nature  of  the  phenotypic  correlation  makes  it  clear  that 
the  magnitude  and  even  the  sign  of  the  genetic  correlation  cannot  be 
determined  from  the  phenotypic  correlation  alone.  Let  us  therefore 
consider  the  methods  by  which  the  genetic  correlation  can  be 
estimated. 

Estimation  of  the  genetic  correlation.  The  estimation  of 
genetic  correlations  rests  on  the  resemblance  between  relatives  in  a 
manner  analogous  to  the  estimation  of  heritabilities  described  in 
Chapter  10.  Therefore  only  the  principle  and  not  the  details  of  the 
procedure  need  be  described  here.  Instead  of  computing  the  com- 
ponents of  variance  of  one  character  from  an  analysis  of  variance,  we 
compute  the  components  of  covariance  of  the  two  characters  from  an 
analysis  of  covariance  which  takes  exactly  the  same  form  as  the  ana- 
lysis of  variance.  Instead  of  starting  from  the  squares  of  the  individual 
values  and  partitioning  the  sums  of  squares  according  to  the  source 
of  variation,  we  start  from  the  product  of  the  values  of  the  two 
characters  in  each  individual  and  partition  the  sums  of  products 
according  to  the  source  of  variation.  This  leads  to  estimates  of  the 
observational components of covariance, whose interpretation in
terms of causal components of covariance is exactly the same as that
of the components of variance given in Table 10.4. Thus, in an
analysis of half-sib families the component of covariance between
sires estimates $\frac{1}{4}\mathrm{cov}_A$, i.e. one quarter of the
covariance of breeding values of the two characters. For the estimation
of the correlation the components of variance of each character are also
needed. Thus the between-sire components of variance estimate
$\frac{1}{4}\sigma^2_{AX}$ and $\frac{1}{4}\sigma^2_{AY}$.
Therefore the genetic correlation is obtained as

$$r_A = \frac{\mathrm{cov}_{XY}}{\sqrt{\mathrm{var}_X\,\mathrm{var}_Y}} \tag{19.2}$$

where var and cov refer to the components of variance and covariance.

Table 19.1

Some Examples of Phenotypic, Genetic, and
Environmental Correlations

The environmental correlations (except those marked *) were
calculated for this table from the genetic correlations and
heritabilities given in the papers cited, by equation 19.1.
They are not purely environmental in causation but include
correlation due to non-additive genetic causes, as explained
in the text. Those marked * are true environmental correlations,
estimated directly from the phenotypic correlation in
inbred lines and crosses.

                                                     rP      rA      rE

Cattle (Johansson, 1950)
  Milk-yield : butterfat-yield.                      .93     .85     .96
  Milk-yield : butterfat %.                         -.14    -.20    -.10
  Butterfat-yield : butterfat %.                     .23     .26     .22

Pigs (Fredeen and Jonsson, 1957)
  Body length : backfat thickness.                  -.24    -.47    -.01
  Growth rate : feed efficiency.                    -.84    -.96    -.50
  Backfat thickness : feed efficiency.               .31     .28     .32

Sheep (Morley, 1955)
  Fleece weight : length of wool.                    .30    -.02    1.17
  Fleece weight : crimps per inch.                  -.21  [illegible] .10
  Fleece weight : body weight.                       .36    -.11    1.05

Poultry (Dickerson, 1957)
  Body weight (at 18 weeks) : egg-production
    (to 72 weeks of age).                            .09    -.16     .18
  Body weight (at 18 weeks) : egg weight.            .16     .50    -.05
  Body weight (at 18 weeks) : age at first egg.     -.30     .29    -.50

Mice (Falconer, 1954b)
  Body weight : tail length (within litters).        .44     .59     .34

Drosophila melanogaster
  Bristle number, abdominal : sternopleural.         .06     .08     .04
    (Clayton, Knight, Morris, and Robertson, 1957)
  Number of bristles on different abdominal
    segments. (Reeve and Robertson, 1954)            --      .96     .05
  Thorax length : wing length.
    (Reeve and Robertson, 1953)                      --      .75     .50

The  offspring-parent  relationship  can  also  be  used  for  estimating 
the  genetic  correlation.  To  estimate  the  heritability  of  one  character 
from  the  resemblance  between  offspring  and  parents  we  compute 
the  covariance  of  offspring  and  parent  for  the  one  character  by 
taking  the  product  of  the  parent  or  mid-parent  value  and  the  mean 
value of the offspring. To estimate the genetic correlation between
two characters we compute what might be called the "cross-covariance,"
obtained from the product of the value of X in parents
and the value of Y in offspring. This "cross-covariance" is half the
genetic covariance of the two characters, i.e. $\frac{1}{2}\mathrm{cov}_A$.
The covariances of offspring and parents for each of the characters
separately are also needed, and then the genetic correlation is given by

$$r_A = \frac{\mathrm{cov}_{XY}}{\sqrt{\mathrm{cov}_{XX}\,\mathrm{cov}_{YY}}} \tag{19.3}$$

where $\mathrm{cov}_{XY}$ is the "cross-covariance," and $\mathrm{cov}_{XX}$
and $\mathrm{cov}_{YY}$ are the
offspring-parent covariances of each character separately.
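The offspring-parent method can be tried out on artificial data. The Python sketch below is our own construction and makes several simplifying assumptions that are not part of the text: one parent and one offspring per family, breeding values of unit variance for both characters, a heritability of one half, and environmental deviations that are uncorrelated between the characters. The function names are hypothetical. The estimate is formed as the cross-covariance divided by the geometric mean of the two offspring-parent covariances.

```python
import random
from math import sqrt


def covariance(u, v):
    """Sample covariance of two equally long lists."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (n - 1)


def genetic_correlation(parent_x, parent_y, offspring_x, offspring_y):
    """Cross-covariance over the root of the two offspring-parent covariances."""
    cov_xy = covariance(parent_x, offspring_y)   # X in parents, Y in offspring
    cov_xx = covariance(parent_x, offspring_x)
    cov_yy = covariance(parent_y, offspring_y)
    return cov_xy / sqrt(cov_xx * cov_yy)


def simulate(n_pairs, r_a, seed=1):
    """Artificial parent-offspring pairs with genetic correlation r_a."""
    rng = random.Random(seed)
    px, py, ox, oy = [], [], [], []
    for _ in range(n_pairs):
        # Parental breeding values: unit variance, correlation r_a.
        a_x = rng.gauss(0, 1)
        a_y = r_a * a_x + sqrt(1 - r_a ** 2) * rng.gauss(0, 1)
        # Phenotype = breeding value + independent environmental deviation.
        px.append(a_x + rng.gauss(0, 1))
        py.append(a_y + rng.gauss(0, 1))
        # Offspring receive half the parental breeding value; everything
        # else (other parent, segregation, environment) is lumped as noise.
        ox.append(0.5 * a_x + rng.gauss(0, 1))
        oy.append(0.5 * a_y + rng.gauss(0, 1))
    return px, py, ox, oy


px, py, ox, oy = simulate(40000, r_a=0.6)
print(f"estimated genetic correlation: {genetic_correlation(px, py, ox, oy):.2f}")
```

Even with many thousands of families the estimate fluctuates noticeably about the true value, which illustrates the large sampling errors to which genetic correlations are subject.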

The genetic correlation can also be estimated from responses to
selection in a manner analogous to the estimation of realised
heritability. This will be explained in the next section.

Data that provide estimates of genetic correlations provide also
estimates of the heritabilities of the correlated characters, and of the
phenotypic correlations. The environmental correlation can then be
found from equation 19.1. If highly inbred lines are available the
environmental correlations can be estimated directly from the
phenotypic correlation within the lines, or preferably within the F1's
of crosses between the lines.

Estimates of genetic correlations are usually subject to rather
large sampling errors and are therefore seldom very precise. The
sampling variance of genetic correlations is treated by Reeve (1955)
and by A. Robertson (1959b). The standard error of an estimate
is given approximately by the following formula:

$$\sigma_{(r_A)} = \frac{1 - r_A^2}{\sqrt{2}} \sqrt{\frac{\sigma_{(h_X^2)}\,\sigma_{(h_Y^2)}}{h_X^2\,h_Y^2}}$$

where $\sigma$ denotes standard error. Since the standard errors of the two
heritabilities appear in the numerator, an experiment designed to
minimise the sampling variance of an estimate of heritability, in the
manner described in Chapter 10, will also have the optimal design for
the estimation of a genetic correlation.
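As a numerical illustration, the approximate standard error may be computed as follows. This Python sketch is our own and assumes that the formula has the form (1 - rA²)/√2 multiplied by the square root of the product of the standard errors of the two heritabilities divided by the product of the heritabilities. The function name and the figures used are hypothetical.

```python
from math import sqrt


def se_genetic_correlation(r_a, h2_x, h2_y, se_h2_x, se_h2_y):
    """Approximate standard error of an estimated genetic correlation,
    in the form assumed in the lead-in above."""
    return (1.0 - r_a ** 2) / sqrt(2.0) * sqrt((se_h2_x * se_h2_y) / (h2_x * h2_y))


# Hypothetical figures: both heritabilities 0.4 with standard error 0.1.
se = se_genetic_correlation(r_a=0.5, h2_x=0.4, h2_y=0.4, se_h2_x=0.1, se_h2_y=0.1)
print(f"standard error of r_A: about {se:.2f}")
```

Halving the standard errors of both heritabilities halves the standard error of the genetic correlation, which is why the optimal designs coincide.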


Correlated  Response  to  Selection 


The  next  problem  for  consideration  concerns  the  response  to 
selection:  if  we  select  for  character  X,  what  will  be  the  change  of  the 
correlated  character  Y?  The  expected  response  of  a  character,  Y, 
when  selection  is  applied  to  another  character,  X,  may  be  deduced  in 
the  following  way.  The  response  of  character  X — i.e.  the  character 
directly  selected — is  equivalent  to  the  mean  breeding  value  of  the 
selected  individuals.  This  was  explained  in  Chapter  11.  The  conse- 
quent change  of  character  Y  is  therefore  given  by  the  regression  of  the 
breeding  value  of  Y  on  the  breeding  value  of  X.  This  regression  is 

_covA_      GAY 

°(A)YX——^ —  'A 

<*AX  &AX 

The response of character X, directly selected, by equation 11.4, is

$$R_X = i h_X \sigma_{AX}$$

Therefore the correlated response of character Y is

$$CR_Y = b_{(A)YX} R_X = i h_X \sigma_{AX}\, r_A \frac{\sigma_{AY}}{\sigma_{AX}} = i h_X r_A \sigma_{AY} \qquad (19.4)$$

Or, by putting $\sigma_{AY} = h_Y \sigma_{PY}$, the correlated response becomes

$$CR_Y = i h_X h_Y r_A \sigma_{PY} \qquad (19.5)$$
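
The two forms of the correlated response can be checked against each other numerically. The sketch below assumes the response is the product i·hX·rA·σAY (equation 19.4) or, equivalently, i·hX·hY·rA·σPY (equation 19.5); the function names and all numerical values are hypothetical.

```python
import math


def correlated_response_additive(i, h_x, r_a, sigma_a_y):
    """Correlated response of Y in terms of its additive s.d. (form 19.4)."""
    return i * h_x * r_a * sigma_a_y


def correlated_response_phenotypic(i, h_x, h_y, r_a, sigma_p_y):
    """Correlated response of Y in terms of its phenotypic s.d. (form 19.5)."""
    return i * h_x * h_y * r_a * sigma_p_y


# Hypothetical parameters, chosen only for illustration.
i = 1.0                       # intensity of selection on X
h_x = math.sqrt(0.25)         # square root of heritability of X
h_y = math.sqrt(0.36)         # square root of heritability of Y
r_a = 0.5                     # genetic correlation
sigma_p_y = 2.0               # phenotypic standard deviation of Y
sigma_a_y = h_y * sigma_p_y   # additive standard deviation of Y

cr_additive = correlated_response_additive(i, h_x, r_a, sigma_a_y)
cr_phenotypic = correlated_response_phenotypic(i, h_x, h_y, r_a, sigma_p_y)
print(cr_additive, cr_phenotypic)
```

Both functions return the same value because the second merely substitutes hY·σPY for σAY.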


Thus the response of a correlated character can be predicted if
the  genetic  correlation  and  the  heritabilities  of  the  two  characters  are 
known.  And,  conversely,  if  the  correlated  response  is  measured  by 
experiment, and the two heritabilities are known, the genetic correlation
can be estimated. If the heritability of character Y is to be
estimated  as  the  realised  heritability  from  the  response  to  selection, 
then  it  is  necessary  to  do  a  double  selection  experiment.  Character  X 
is  selected  in  one  line  and  character  Y  in  another.  Then  both  the 
direct  and  the  correlated  responses  of  each  character  can  be  measured. 
This type of experiment provides two estimates of the genetic correlation
(by equation 19.5), one from the correlated response of each
character; and the two estimates should agree if the theory of correlated
responses expressed in equation 19.5 adequately describes the
observed responses (Falconer, 1954b). A joint estimate of the genetic
correlation can be obtained from such double selection experiments,
without the need for estimates of the heritabilities, from the following
formula which may be easily derived from equations 11.4 and 19.4:


$$r_A^2 = \frac{CR_X}{R_X} \cdot \frac{CR_Y}{R_Y} \qquad (19.6)$$
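
A joint estimate of this kind can be sketched in a few lines. The code assumes the relation rA² = (CRX/RX)(CRY/RY), in which the heritabilities and the intensities of selection cancel out of the product. The function name, the rule for attaching a sign to the square root, and the response values are all assumptions made for illustration; they are not data from any experiment.

```python
import math


def joint_genetic_correlation(r_x, cr_x, r_y, cr_y):
    """Joint estimate of r_A from a double selection experiment.

    r_x, r_y   direct responses of X and Y in their own selection lines
    cr_x, cr_y correlated responses of X (Y line) and of Y (X line)
    """
    ratio_x = cr_x / r_x
    ratio_y = cr_y / r_y
    product = ratio_x * ratio_y
    if product < 0:
        # The two lines disagree about the sign of the correlation.
        raise ValueError("correlated responses disagree in sign")
    # Assumption: give the estimate the sign shared by both ratios.
    return math.copysign(math.sqrt(product), ratio_x)


# Hypothetical responses, in arbitrary units.
r_a_hat = joint_genetic_correlation(r_x=2.0, cr_x=1.2, r_y=1.5, cr_y=0.9)
print(f"joint estimate of r_A: {r_a_hat:.2f}")
```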


Example 19.1. In a study of wing length and thorax length in Drosophila
melanogaster, Reeve and Robertson (1953) estimated the genetic
correlation  between  these  two  measures  of  body  size  from  the  responses  to 
selection.  There  were  two  pairs  of  selection  lines;  one  pair  was  selected  for 
increased  and  for  decreased  thorax  length,  and  the  other  pair  for  increased 
and  for  decreased  wing  length.  In  each  line  the  correlated  response  of  the 
character  not  directly  selected  was  measured,  as  well  as  the  response  of  the 
character  directly  selected.  Two  estimates  of  the  genetic  correlation  were 
obtained by equation 19.6, one from the responses to upward selection and
the  other  from  the  responses  to  downward  selection.  In  addition,  estimates 
of  the  genetic  correlation  in  the  unselected  population  were  obtained  from 
the offspring-parent covariance and also from the full-sib covariance. The
four  estimates  were  as  follows: 


Method                  Genetic correlation

Offspring-parent        0.74
Full sib                0.75
Selection, upward       0.71
Selection, downward     0.73

The  agreement  between  the  estimates  from  selection  and  the  estimates 
from  the  unselected  population  shows  that  the  correlated  responses  were 
very close to what would have been predicted from the genetic analysis of
the  unselected  population. 


Close  agreement  between  observed  and  predicted  correlated 
responses,  such  as  was  shown  in  the  above  example,  cannot  always  be 
expected,  particularly  if  the  genetic  correlation  is  low.  With  a  low 
genetic  correlation  the  expected  response  is  small  and  is  liable  to  be 
obscured by random drift (see Clayton, Knight, Morris and Robertson,
1957). Also, if the genetic correlation is to any great extent
caused  by  linkage,  it  is  likely  to  diminish  in  magnitude  through 
recombination,  with  a  consequent  diminution  of  the  correlated 
response.  There  has  not  yet  been  enough  experimental  study  of 
correlated  responses  to  allow  us  to  draw  any  conclusions  about  the 
number  of  generations  over  which  they  continue,  nor  about  the  total 
response  when  the  limit  is  reached. 

Indirect selection. Consideration of correlated responses suggests
that it might sometimes be possible to achieve more rapid progress
under selection for a correlated response than from selection for
the  desired  character  itself.  In  other  words,  if  we  want  to  improve 
character  X,  we  might  select  for  another  character,  Y,  and  achieve 
progress  through  the  correlated  response  of  character  X.  We  shall 
refer  to  this  as  "indirect"  selection;  that  is  to  say,  selection  applied  to 
some  character  other  than  the  one  it  is  desired  to  improve.  And  we 
shall refer to the character to which selection is applied as the
"secondary" character. The conditions under which indirect selection
would be advantageous are readily deduced. Let Rx be the
direct  response  of  the  desired  character,  if  selection  were  applied 
directly  to  it.  And  let  CRX  be  the  correlated  response  of  character  X 
resulting  from  selection  applied  to  the  secondary  character,  Y.  The 
merit  of  indirect  selection  relative  to  that  of  direct  selection  may  then 
be  expressed  as  the  ratio  of  the  expected  responses,  CRX/RX.  Taking 
the  expected  correlated  response  from  equation  19.4  and  the  expected 
direct response from equation 11.4, we find

$$\frac{CR_X}{R_X} = \frac{i_Y h_Y r_A \sigma_{AX}}{i_X h_X \sigma_{AX}} = r_A \frac{i_Y}{i_X} \frac{h_Y}{h_X} \qquad (19.7)$$

If  the  same  intensity  of  selection  can  be  achieved  when  selecting  for 
character Y as when selecting for character X, then the correlated
response  will  be  greater  than  the  direct  response  if  rAhY  is  greater 
than  hx.  Therefore  indirect  selection  cannot  be  expected  to  be 
superior  to  direct  selection  unless  the  secondary  character  has  a 
substantially  higher  heritability  than  the  desired  character,  and  the 
genetic  correlation  between  the  two  is  high;  or,  unless  a  substantially 
higher  intensity  of  selection  can  be  applied  to  the  secondary  than  to 
the  desired  character.  The  circumstances  most  likely  to  render 
indirect  selection  superior  to  direct  selection  are  chiefly  concerned 
with  technical  difficulties  in  applying  selection  directly  to  the  desired 
character.   Two  such  technical  difficulties  may  be  mentioned  briefly. 

1. If the desired character is difficult to measure with precision,
the  errors  of  measurement  may  so  reduce  the  heritability  that  indirect 
selection  becomes  advantageous.  Threshold  characters  in  general  are 
likely for this reason to repay a search for a suitable correlated character,
unless the position of the threshold can be adjusted in the manner
described  in  the  last  chapter.  An  interesting  experimental  result  which 
may  well  prove  to  be  an  example  of  indirect  selection  being  superior 
to  direct  selection  concerns  sex  ratio  in  mice.  The  sex  ratio  among 
the  progeny  may  be  regarded  as  a  metric  character  of  the  parents. 
Selection  applied  directly  to  sex  ratio  was  ineffective  in  changing  it 
(Falconer,  1954c),  but  selection  for  blood-pH  produced  a  correlated 
change  of  sex  ratio  (Weir  and  Clark,  1955;  Weir,  1955).  The  reason 
for  the  ineffectiveness  of  direct  selection  is  probably  that  the  true  sex 
ratio  of  a  family  is  subject  to  a  large  error  of  estimation  resulting 
from  the  sampling  variation,  and  the  heritability  is  consequently  very 
low. 

2.  If  the  desired  character  is  measurable  in  one  sex  only,  but  the 
secondary  character  is  measurable  in  both,  then  a  higher  intensity  of 
selection  will  be  possible  by  indirect  selection.  Other  things  being 
equal,  the  intensity  of  selection  would  be  twice  as  great  by  indirect 
as  by  direct  selection;  but  a  better  plan  would  be  to  select  one  sex 
directly  for  the  desired  character  and  the  other  indirectly  for  the 
secondary  character. 
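
The comparison of indirect with direct selection can be made concrete with a short calculation. The sketch below assumes the ratio of responses is rA·(iY/iX)·(hY/hX), as in equation 19.7; the function name and all parameter values are hypothetical and serve only to show the two situations just described.

```python
import math


def relative_merit_of_indirect(r_a, h2_x, h2_y, i_x=1.0, i_y=1.0):
    """Ratio CR_X / R_X: values above 1 favour indirect selection.

    h2_x is the heritability of the desired character X, h2_y that of the
    secondary character Y; i_x and i_y are the intensities of selection.
    """
    return r_a * (i_y / i_x) * math.sqrt(h2_y / h2_x)


# Case 1 (hypothetical): the desired character is poorly measured, so its
# heritability is low, while the secondary character is highly heritable.
low_heritability_case = relative_merit_of_indirect(r_a=0.8, h2_x=0.1, h2_y=0.4)

# Case 2 (hypothetical): equal heritabilities, but twice the intensity of
# selection is possible on the secondary character.
double_intensity_case = relative_merit_of_indirect(
    r_a=0.6, h2_x=0.3, h2_y=0.3, i_x=1.0, i_y=2.0)

# Equal heritabilities and intensities: indirect selection cannot win,
# because the ratio then equals the genetic correlation itself.
equal_case = relative_merit_of_indirect(r_a=0.8, h2_x=0.3, h2_y=0.3)

print(low_heritability_case, double_intensity_case, equal_case)
```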

Though indirect selection has been presented above as an alternative
to direct selection, the most effective method in theory is neither
one  nor  the  other  but  a  combination  of  the  two.  The  most  effective  use 
that  can  be  made  of  a  correlated  character  is  in  combination  with  the 
desired  character,  as  an  additional  source  of  information  about  the 
breeding  values  of  individuals.   This,  however,  is  a  special  case  of  a 
more general problem which will be dealt with in the final section of
this  chapter.  First  we  shall  show  how  the  idea  of  indirect  selection 
can  be  extended  to  cover  selection  in  different  environments. 


Genotype-Environment  Interaction 

The  concept  of  genetic  correlation  can  be  applied  to  the  solution 
of  some  problems  connected  with  the  interaction  of  genotype  with 
environment.  The  meaning  of  interaction  between  genotype  and 
environment  was  explained  in  Chapter  8,  where  it  was  discussed  as  a 
source  of  variation  of  phenotypic  values,  which  in  most  analyses  is 
inseparable  from  the  environmental  variance.  The  chief  problem 
which  it  raises  and  which  we  are  now  in  a  position  to  discuss  concerns 
adaptation to local conditions. The existence of genotype-environment
interaction may mean that the best genotype in one environment
is not the best in another environment. It is obvious, for
example,  that  the  breed  of  cattle  with  the  highest  milk-yield  in 
temperate  climates  is  unlikely  also  to  have  the  highest  yield  in  tropical 
climates. But it is not so obvious whether smaller differences of environmental
conditions also require locally adapted breeds; nor is it
intuitively  obvious  how  much  of  the  improvement  made  in  one 
environment  will  be  carried  over  if  the  breed  is  then  transferred  to 
another  environment.  These  matters  have  an  important  bearing  on 
breeding  policy.  If  selection  is  made  under  good  conditions  of  feeding 
and  management  on  the  best  farms  and  experimental  stations,  will 
the  improvement  achieved  be  carried  over  when  the  later  generations 
are  transferred  to  poorer  conditions?  Or  would  the  selection  be 
better  done  in  the  poorer  conditions  under  which  the  majority  of 
animals are required to live? The idea of genetic correlation provides
the  basis  for  a  solution  of  these  problems  in  the  following  way. 

A  character  measured  in  two  different  environments  is  to  be 
regarded not as one character but as two. The physiological mechanisms
are to some extent different, and consequently the genes required
for high performance are to some extent also different. For
example,  growth  rate  on  a  low  plane  of  nutrition  may  be  principally 
a  matter  of  efficiency  of  food-utilisation,  whereas  on  a  high  plane  of 
nutrition  it  may  be  principally  a  matter  of  appetite.  By  regarding 
performance  in  different  environments  as  different  characters  with 
genetic correlation between them we can in principle solve the problems
outlined above from a knowledge of the heritabilities of the
different  characters  and  the  genetic  correlations  between  them 
(Falconer,  1952).  If  the  genetic  correlation  is  high,  then  performance 
in  two  different  environments  represents  very  nearly  the  same 
character,  determined  by  very  nearly  the  same  set  of  genes.  If  it  is 
low,  then  the  characters  are  to  a  great  extent  different,  and  high 
performance requires a different set of genes. Here we shall consider
only two environments, but the idea can be extended to an
indefinite number of different environments (A. Robertson,

Let us consider the problem of the "carry-over" of the improvement
from one environment to another. Let us suppose that we
select  for  character  X — say  growth  rate  on  a  high  plane  of  nutrition — 
and  we  look  for  improvement  in  character  Y — say  growth  rate  on  a 
low  plane  of  nutrition.  The  improvement  of  character  Y  is  simply  a 
correlated  response  and  the  expected  rate  of  improvement  was  given 
in equation 19.5 as

$$CR_Y = i h_X h_Y r_A \sigma_{PY}$$

The  improvement  of  performance  in  an  environment  different  from 
the  one  in  which  selection  was  carried  out  can  therefore  be  predicted 
from a knowledge of the heritability of performance in each environment
and the genetic correlation between the two performances. We
can  also  compare  the  improvement  expected  by  this  means  with  that 
expected if we had selected directly for character Y, i.e. for performance
in the environment for which improvement is wanted. This
is  simply  a  comparison  of  indirect  with  direct  selection,  which  was 
explained  in  the  previous  section.  The  comparison  is  made  from  the 
ratio of the two expected responses given in equation 19.7, i.e.

$$\frac{CR_Y}{R_Y} = r_A \frac{i_X h_X}{i_Y h_Y}$$


This  shows  how  much  we  may  expect  to  gain  or  lose  by  carrying  out 
the  selection  in  some  environment  other  than  the  one  in  which  the 
improved population is required to live. If we assume that the intensity
of selection is not affected by the environment in which the
selection is carried out, then the indirect method will be better if
rAhX is greater than hY, where hX is the square root of the heritability
in  the  environment  in  which  selection  is  made,  and  hY  is  the  square 
root  of  the  heritability  in  the  environment  in  which  the  population  is 
required subsequently to live. If the genetic correlation is high, then
the  two  characters  can  be  regarded  as  being  substantially  the  same; 
and  if  there  are  no  special  circumstances  affecting  the  heritability  or 
the intensity of selection it will make little difference in which environment
the selection is carried out. But if the genetic correlation
is  low,  then  it  will  be  advantageous  to  carry  out  the  selection  in  the 
environment  in  which  the  population  is  destined  to  live,  unless  the 
heritability  or  the  intensity  of  selection  in  the  other  environment  is 
very  considerably  higher. 

This  is  the  theoretical  basis  for  dealing  with  selection  in  different 
environments.  So  far,  however,  there  has  been  little  experimental 
work  to  substantiate  the  theory.  The  results  of  the  experiments  that 
have  been  carried  out  do  not  appear  to  be  fully  in  agreement  with 
theoretical  expectations,  and  this  suggests  that  other  factors  not  yet 
understood  are  probably  operating.  (See  Falconer  and  Latyszewski, 
1952;  Falconer,  1952.) 


Simultaneous  Selection  for  more  than  one  Character 

When  selection  is  applied  to  the  improvement  of  the  economic 
value  of  animals  or  plants  it  is  generally  applied  to  several  characters 
simultaneously  and  not  just  to  one,  because  economic  value  depends 
on  more  than  one  character.  For  example,  the  profit  made  from  a 
herd  of  pigs  depends  on  their  fertility,  mothering  ability,  growth 
rate,  efficiency  of  food-utilisation,  and  carcass  qualities.  How,  then, 
should  selection  be  applied  to  the  component  characters  in  order  to 
achieve  the  maximum  improvement  of  economic  value?  There  are 
several  possible  procedures.  One  might  select  in  turn  for  each 
character  singly  ("tandem"  selection);  or  one  might  select  for  all  the 
characters  at  the  same  time  but  independently,  rejecting  all  individuals 
that  fail  to  come  up  to  a  certain  standard  for  each  character  regardless 
of  their  values  for  any  other  of  the  characters  ("independent  culling 
levels"). It has been shown, however, that the most rapid improvement
of economic value is expected from selection applied simultaneously
to all the component characters together, appropriate weight
being given to each character according to its relative economic
importance, its heritability and the genetic and phenotypic correlations
between the different characters (Hazel and Lush, 1942;
Hazel, 1943). The practice of selection for economic value is thus a
matter of some complexity. The component characters have to be
combined  together  into  a  score,  or  index,  in  such  a  way  that  selection 
applied  to  the  index,  as  if  the  index  were  a  single  character,  will  yield 
the  most  rapid  possible  improvement  of  economic  value.  If  the 
characters  are  uncorrelated  there  is  no  great  problem:  each  character 
is  weighted  by  the  product  of  its  relative  economic  value  and  its 
heritability.  This  is  the  best  that  can  be  done  in  the  absence  of 
information about the genetic correlations, but if the genetic correlations
are known the efficiency of the index can be improved. The
following  account  gives  an  outline  of  the  principles  on  which  the 
construction  of  a  selection  index  is  based.  For  a  fuller  account  the 
reader  should  consult  Lerner  (1950)  and  the  original  papers  of 
Fairfield  Smith  (1936)  and  Hazel  (1943). 

For  the  sake  of  simplicity  we  shall  consider  only  two  component 
characters of economic value, but the conclusions can readily be extended
to any number of characters. Let the economic value be
determined  by  two  characters  X  and  Y,  and  let  w  be  the  additional 
profit  expected  from  one  unit  increase  of  Y  relative  to  that  from  one 
unit  increase  of  X.  The  aim  of  selection  therefore  is  to  pick  out 
individuals with the highest values of (Ax + wAY), where Ax and AY
are  the  breeding  values  of  the  two  characters  X  and  Y.  Let  us  call 
this  compound  breeding  value  "merit,"  with  the  symbol  H,  so  that 


$$H = A_X + wA_Y \qquad (19.8)$$


The  problem  is  to  find  out  how  the  phenotypic  values,  Px  and  PY,  of 
the  two  component  characters  are  to  be  combined  into  an  index  that 
gives  the  best  estimate  of  an  individual's  merit,  H.  In  Chapter  10  we 
saw  how  the  best  estimate  of  the  breeding  value  of  an  individual  for 
one character is the regression equation $A = b_{AP}P$, where bAP is the
regression  of  breeding  value  on  phenotypic  value,  and  is  equal  to  the 
heritability  (see  p.  166).  The  present  problem  is  essentially  the  same, 
only  now  we  have  to  use  partial  regression  coefficients.  The  multiple 
regression  equation  giving  the  best  estimate  of  merit  is 

$$H = b_{HX.Y} P_X + b_{HY.X} P_Y \qquad (19.9)$$


where  Px  and  PY  are  phenotypic  values  measured  as  deviations  from 
the  population  mean.  (In  this  formula,  and  in  those  that  follow,  the 
symbol  X  has  the  same  meaning  as  Px,  i.e.  the  phenotypic  value  of 
character  X;  and  similarly  Y  and  PY  both  mean  the  phenotypic  value 
of character Y. Thus, bHX.Y is the regression of merit on the phenotypic
value of X when the phenotypic value of Y is held constant, and
bHY.X has a similar meaning with X and Y interchanged.) In practice
it is convenient to have the index in a form that requires the manipulation
of only one of the phenotypic values, i.e. in the form

$$I = P_X + W P_Y \qquad (19.10)$$

where I is the index by means of which individuals are to be chosen,
and  W  is  a  factor  by  which  the  phenotypic  value  of  character  Y  is 
to  be  multiplied.  Since  the  absolute  magnitude  of  the  index  is  of  no 
importance,  but  only  its  relative  magnitude  in  different  individuals, 
we  can  work  with  the  phenotypic  values  as  they  stand  instead  of  with 
deviations from the population mean. And we can put equation
19.9 into the form of equation 19.10 simply by dividing through by
bHX.Y. Then W in equation 19.10 is the ratio of the two partial
regression coefficients,

$$W = \frac{b_{HY.X}}{b_{HX.Y}}$$


and  our  task  now  is  to  find  a  way  of  expressing  W  in  terms  of  the 
genetic  properties  of  the  two  characters. 

First  let  us  put  the  partial  regression  coefficients  in  terms  of  the 
total  regression  coefficients.  For  example, 


$$b_{HY.X} = \frac{b_{HY} - b_{HX} b_{XY}}{1 - r_{XY}^2}$$

Therefore

$$W = \frac{b_{HY.X}}{b_{HX.Y}} = \frac{b_{HY} - b_{HX} b_{XY}}{b_{HX} - b_{HY} b_{YX}}$$


Now  let  us  express  these  total  regressions  in  terms  of  covariances  and 
variances. For example, $b_{HY} = cov_{HY}/\sigma_Y^2$. After some simplification
the expression reduces to

$$W = \frac{\sigma_X^2\, cov_{HY} - cov_{HX}\, cov_{XY}}{\sigma_Y^2\, cov_{HX} - cov_{HY}\, cov_{XY}} \qquad (19.11)$$


The variances $\sigma_X^2$ and $\sigma_Y^2$ here, and in what follows, are the phenotypic
variances of characters X and Y. The covariances in the above
expression can be expressed in terms of the phenotypic variance and
the heritability of each character and of the phenotypic and genetic
correlations between the two characters, all of which quantities can
be estimated. Take, for example, the covariance of H with X. This
may be written as follows:

$$\begin{aligned}
cov_{HX} &= \text{covariance of } (A_X + wA_Y) \text{ with } P_X \\
&= cov(A_X, P_X) + cov(wA_Y, P_X) \\
&= h_X^2 \sigma_X^2 + w r_A h_X \sigma_X h_Y \sigma_Y
\end{aligned}$$

In this way the covariances in equation 19.11 can be expressed as
follows, σ and σ² being phenotypic standard deviations and variances
throughout:

$$\begin{aligned}
cov_{HX} &= h_X^2 \sigma_X^2 + w r_A h_X h_Y \sigma_X \sigma_Y \\
cov_{HY} &= w h_Y^2 \sigma_Y^2 + r_A h_X h_Y \sigma_X \sigma_Y \\
cov_{XY} &= r_P \sigma_X \sigma_Y
\end{aligned} \qquad (19.12)$$

The procedure for selection is thus to compute the covariances
given in 19.12, substitute them in 19.11 and use the value of W so
obtained to compute the index of selection I given in 19.10. The
value of the index for each individual then forms the basis of selection.
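
The procedure can be sketched as a short program. It assumes the three covariances have the forms cov(H,X) = h²X σ²X + w rA hX hY σX σY, cov(H,Y) = w h²Y σ²Y + rA hX hY σX σY and cov(X,Y) = rP σX σY, and that W is the ratio (σ²X cov(H,Y) - cov(H,X) cov(X,Y)) / (σ²Y cov(H,X) - cov(H,Y) cov(X,Y)). The function names, parameter values and phenotypic records are hypothetical.

```python
import math


def index_weight(w, h2_x, h2_y, r_a, r_p, sd_x, sd_y):
    """Weight W for character Y in the index I = P_X + W * P_Y."""
    h_x, h_y = math.sqrt(h2_x), math.sqrt(h2_y)
    # Covariances of merit with each phenotypic value, and of X with Y.
    cov_hx = h2_x * sd_x ** 2 + w * r_a * h_x * h_y * sd_x * sd_y
    cov_hy = w * h2_y * sd_y ** 2 + r_a * h_x * h_y * sd_x * sd_y
    cov_xy = r_p * sd_x * sd_y
    # Ratio of the two partial regression coefficients.
    numerator = sd_x ** 2 * cov_hy - cov_hx * cov_xy
    denominator = sd_y ** 2 * cov_hx - cov_hy * cov_xy
    return numerator / denominator


def rank_by_index(phenotypes, weight):
    """Sort (P_X, P_Y) records by index value, highest first."""
    return sorted(phenotypes, key=lambda p: p[0] + weight * p[1], reverse=True)


# Uncorrelated characters: W reduces to w * h2_y / h2_x.
w_uncorrelated = index_weight(w=2.0, h2_x=0.25, h2_y=0.5,
                              r_a=0.0, r_p=0.0, sd_x=1.0, sd_y=1.0)

# The same characters with hypothetical non-zero correlations.
w_correlated = index_weight(w=2.0, h2_x=0.25, h2_y=0.5,
                            r_a=0.3, r_p=0.2, sd_x=1.0, sd_y=1.0)

records = [(10.0, 1.0), (9.0, 2.0), (11.0, 0.5)]  # made-up (P_X, P_Y) pairs
ranked = rank_by_index(records, w_uncorrelated)
print(w_uncorrelated, w_correlated, ranked)
```

The function takes exactly seven estimated quantities: the relative economic value, two heritabilities, two correlations and two phenotypic standard deviations.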

The  index  as  formulated  above  is  applicable  only  to  individual 
selection.  If  family  selection  is  applied  then  the  heritabilities  and 
correlations  that  go  into  the  index  must  be  those  appropriate  to  the 
family  means.  Family  selection,  however,  is  not  greatly  improved  by 
the  use  of  an  index,  because  the  family  heritabilities  of  the  component 
characters  are  generally  fairly  high  and  the  mean  economic  value  of  a 
family  in  terms  of  phenotypic  values  is  not  very  different  from  its 
merit  in  terms  of  breeding  values.  Therefore  family  selection  for 
economic  value  can  be  applied  with  little  loss  of  efficiency  if  the 
phenotypic values are weighted only by w, the relative economic importance
of each component character.

The  complexity  of  selection  by  means  of  an  index  need  hardly  be 
emphasised,  especially  when  the  index  is  extended  to  cover  many 
component  characters.  Even  with  two  characters,  estimates  of  no 
fewer  than  seven  quantities  are  required  for  the  construction  of  the 
index.  Since  some  of  these,  particularly  the  genetic  correlation, 
cannot  usually  be  estimated  with  any  great  precision,  the  index 
cannot  be  regarded  as  much  more  than  a  rough  guide  to  procedure. 
But  since  selection  has  to  be  applied  to  economic  value  by  some 
means,  it  seems  better  to  use  a  selection  index,  however  imprecise, 
than to base selection on a purely arbitrary combination of component
characters.

Use  of  a  secondary  character  by  means  of  an  index.    The 
selection index described above can readily be adapted to meet the
case  where  improvement  of  only  one  character  is  sought,  the  other 
character  being  used  merely  as  an  aid  to  more  efficient  selection. 
The  use  of  a  secondary  character  in  this  way  was  mentioned  earlier, 
in  connexion  with  indirect  selection.  Let  X  be  the  character  it  is 
desired  to  improve,  and  Y  the  secondary  character.  Then  the 
relative  "economic"  value  of  character  Y  is  zero,  and  we  can  substitute 
w = 0 in the formulae of 19.12. Substitution of the covariances in
equation 19.11 then yields a formula which on simplification reduces
to

$$W = \frac{\sigma_X (r_A h_Y - r_P h_X)}{\sigma_Y (h_X - r_A r_P h_Y)} \qquad (19.13)$$

The selection index of equation 19.10 is then used with this value of
W.  The  value  of  W  in  the  index  may  be  negative.  This  will  arise  if 
the  phenotypic  correlation  between  the  two  characters  is  chiefly 
environmental  in  origin.  The  secondary  character  then  acts  as  an 
indicator  of  the  environmental  deviation  rather  than  of  the  breeding 
value  of  the  desired  character  (see  Rendel,    1954;   and  Osborne, 
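
The sign of the weight can be explored numerically. The sketch below assumes that putting w = 0 in the covariances and simplifying gives W = σX(rA hY - rP hX) / (σY(hX - rA rP hY)); the function name and the parameter values are hypothetical.

```python
import math


def secondary_character_weight(h2_x, h2_y, r_a, r_p, sd_x, sd_y):
    """Weight W for a secondary character Y that has no economic value."""
    h_x, h_y = math.sqrt(h2_x), math.sqrt(h2_y)
    return (sd_x * (r_a * h_y - r_p * h_x)) / (sd_y * (h_x - r_a * r_p * h_y))


# Purely environmental phenotypic correlation (r_A = 0): W is negative, so
# a high value of Y counts against the individual.
w_environmental = secondary_character_weight(
    h2_x=0.25, h2_y=0.25, r_a=0.0, r_p=0.5, sd_x=1.0, sd_y=1.0)

# Strong genetic correlation with a highly heritable secondary character:
# W is positive, and Y adds information about the breeding value of X.
w_genetic = secondary_character_weight(
    h2_x=0.16, h2_y=0.64, r_a=0.8, r_p=0.3, sd_x=1.0, sd_y=1.0)

print(w_environmental, w_genetic)
```

In the first case the index becomes PX - 0.5 PY: the secondary character is used to discount the part of PX that is attributable to a shared environmental deviation.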

Genetic  correlation  and  the  selection  limit.  There  is  one 
important consequence of simultaneous selection for several characters
to be discussed before we leave the subject. Just as the heritabilities
are expected to change after selection has been applied for
some  time,  so  also  are  the  genetic  correlations.  If  selection  has  been 
applied  to  two  characters  simultaneously  the  genetic  correlation 
between  them  is  expected  eventually  to  become  negative,  for  the 
following  reason.  Those  pleiotropic  genes  that  affect  both  characters 
in  the  desired  direction  will  be  strongly  acted  on  by  selection  and 
brought  rapidly  toward  fixation.  They  will  then  contribute  little  to 
the variances or to the covariance of the two characters. The pleiotropic
genes that affect one character favourably and the other adversely
will, however, be much less strongly influenced by selection
and  will  remain  for  longer  at  intermediate  frequencies.  Most  of  the 
remaining  covariance  of  the  two  characters  will  therefore  be  due  to 
these  genes,  and  the  resulting  genetic  correlation  will  be  negative. 
The  consequence  of  a  negative  genetic  correlation,  whether  produced 
by  selection  in  this  way  or  present  from  the  beginning,  is  that  the  two 
characters  may  each  show  a  heritability  that  is  far  from  zero,  and  yet 
when  selection  is  applied  to  them  simultaneously  neither  responds. 
We have already discussed, in Chapter 12, what is essentially the
same  situation  resulting  from  the  combined  effects  of  artificial  and 
natural selection: a selection limit is reached even though the character
to which artificial selection is applied still shows a substantial
amount  of  additive  genetic  variance. 

Example  19.2.  A  practical  example  from  a  commercial  flock  of  poultry 
is  described  by  Dickerson  (1955).  Selection  for  economic  value  had  been 
applied  for  many  years,  but  recent  progress  in  the  component  characters 
was  much  less  than  was  to  be  expected  from  their  heritabilities,  which  were 
found  to  be  moderately  high.  Estimations  of  the  genetic  correlations 
between the component characters showed that many of these were negative.
To take just one example, the relationships between egg-production
and  egg  weight  were  as  follows: 


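Character                        X (Production)   Y (Weight)
Heritability, h²                 0.32             0.59

Phenotypic correlation, rP        0.04
Genetic correlation, rA          -0.39
Environmental correlation, rE    +0.25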


In spite of the high heritabilities neither character had shown any improvement
over the last 10-15 years. The high negative genetic correlation
would  account  for  this  failure  to  respond,  if  selection  was  applied  to  both 
characters  simultaneously.  It  is  interesting  to  note  that  environmental 
variation,  unlike  genetic  variation,  affects  both  characters  in  the  same  way 
and leads to a positive environmental correlation. The phenotypic correlation,
which is almost zero, gives no clue to the genetic relationship
between the two characters, and the failure to respond to selection could
not have been predicted from it alone.

A  population  which  has  been  subjected  over  a  long  period  to 
selection  for  economic  value  throws  light  on  the  genetic  properties  to 
be  expected  in  natural  populations  subject  to  natural  selection  for 
fitness. Fitness is a compound character with many components,
far more than would appear in the most elaborate assessment of
economic value, and so we should expect negative genetic correlations
between its major components, a conclusion to be developed
further  in  the  next  chapter.  It  is  interesting  to  note,  however,  that 
natural selection takes no account of heritabilities or genetic correlations,
and is therefore, in theory, less efficient in improving fitness
than  artificial  selection  by  means  of  an  index  is  in  improving  economic 
value. 


CHAPTER   20 

METRIC  CHARACTERS  UNDER 
NATURAL  SELECTION 

Throughout  the  discussion  of  the  genetic  properties  of  metric 
characters,  which  has  occupied  the  major  part  of  the  book,  very  little 
attention has been given to the effects of natural selection, and something
must now be done to remedy this omission. The absence of
differential  viability  and  fertility  was  specified  as  a  condition  in  the 
theoretical  development  of  the  subject:  that  is  to  say,  natural  selection 
was assumed to be absent. Though for many purposes this assumption
may lead to no serious error, a complete understanding of metric
characters  will  not  be  reached  until  the  effects  of  natural  selection 
can  be  brought  into  the  picture.  The  operation  of  natural  selection 
on  metric  characters  has,  however,  a  much  wider  interest  than  just 
as  a  complication  that  may  disturb  the  simple  theoretical  picture  and 
the  predictions  based  on  it.  It  is  to  natural  selection  that  we  must  look 
for  an  explanation  of  the  genetic  properties  of  metric  characters  which 
hitherto we have accepted with little comment. The genetic properties
of a population are the product of natural selection in the past,
together  with  mutation  and  random  drift.  It  is  by  these  processes 
that  we  must  account  for  the  existence  of  genetic  variability;  and  it  is 
chiefly  by  natural  selection  that  we  must  account  for  the  fact  that 
characters  differ  in  their  genetic  properties,  some  having  propor- 
tionately more  additive  variance  than  others,  some  showing  in- 
breeding depression  while  others  do  not.  These,  however,  are  very 
wide  problems  which  are  still  far  from  solution,  and  in  this  con- 
cluding chapter  we  can  do  little  more  than  indicate  their  nature.  Any 
discussion  of  them,  moreover,  cannot  but  be  controversial;  the  reader 
should  therefore  understand  that  the  contents  of  this  chapter  are  to  a 
large  extent  matters  of  personal  opinion,  and  that  any  conclusions  to 
which  the  discussion  may  lead  are  open  to  dispute. 

We  shall  refer  throughout  to  a  population  that  is  in  genetic  equil- 
ibrium. Being  in  genetic  equilibrium  means  that  the  gene  frequencies 
are  not  changing,  and  therefore  that  the  mean  values  of  all  metric 
characters are constant. (Changes of environmental conditions are
assumed  to  be  so  slow  as  to  be  negligible.)  The  population  is  con- 
stantly subject  to  natural  selection  tending  to  increase  fitness,  but 
despite  the  selection  the  gene  frequencies  do  not  change  and  fitness 
does  not  improve.  There  can  therefore  be  no  additive  genetic  vari- 
ance of  fitness:  in  other  words,  if  we  could  measure  fitness  itself  as  a 
character  we  should  find  that  its  genetic  variance  was  entirely  non- 
additive.  For  the  purposes  of  discussion  we  may  regard  any  natural 
population  as  being  in  genetic  equilibrium,  at  least  approximately, 
and  also  any  population  that  has  been  subject  to  artificial  selection 
consistently  over  a  long  period  of  time,  provided  that  fitness  is 
defined  in  terms  of  both  the  artificial  and  the  natural  selection. 
Fitness,  crudely  defined,  is  the  "character"  selected  for,  whether  by 
natural  selection  alone  or  by  artificial  and  natural  selection  com- 
bined. 
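
The statement that fitness has no additive variance at equilibrium
can be illustrated with a single locus. What follows is a minimal
sketch in Python, not part of the original argument. It assumes one
locus with two alleles, random mating, and superior fitness of the
heterozygote, with hypothetical fitnesses chosen only for illustration.
It uses the standard single-locus expressions for the average effect of
a gene substitution, alpha = a + d(q - p), and for the additive and
dominance variances, 2pq(alpha squared) and (2pqd) squared. The function
names are our own.

```python
def fitness_variance_components(q, w11, w12, w22):
    """Additive and dominance variance of fitness at one locus, two alleles.

    q is the frequency of allele A2; w11, w12, w22 are the fitnesses of
    A1A1, A1A2 and A2A2. Random mating is assumed.
    """
    p = 1.0 - q
    # a is half the difference between the homozygotes; d is the deviation
    # of the heterozygote from their mid-point.
    a = (w11 - w22) / 2.0
    d = w12 - (w11 + w22) / 2.0
    alpha = a + d * (q - p)           # average effect of a gene substitution
    v_a = 2.0 * p * q * alpha ** 2    # additive variance of fitness
    v_d = (2.0 * p * q * d) ** 2      # dominance variance of fitness
    return v_a, v_d


def next_q(q, w11, w12, w22):
    """Frequency of A2 after one generation of selection."""
    p = 1.0 - q
    mean_w = p * p * w11 + 2.0 * p * q * w12 + q * q * w22
    return (p * q * w12 + q * q * w22) / mean_w


def equilibrium_q(w11, w12, w22, q=0.1, generations=2000):
    """Iterate selection until the gene frequency has settled."""
    for _ in range(generations):
        q = next_q(q, w11, w12, w22)
    return q


if __name__ == "__main__":
    # Hypothetical fitnesses: both homozygotes inferior to the heterozygote.
    w = (0.8, 1.0, 0.7)
    q_hat = equilibrium_q(*w)
    print("equilibrium q:", q_hat)
    print("V_A, V_D at equilibrium:", fitness_variance_components(q_hat, *w))
    print("V_A, V_D at q = 0.1:", fitness_variance_components(0.1, *w))
```

With these fitnesses the gene frequency settles at q = 0.4, where the
additive variance of fitness vanishes to within rounding error while the
dominance variance remains at about 0.0144. Away from equilibrium, at
q = 0.1 for example, the additive variance is not zero, and selection
changes the gene frequency.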

If  a  population  is  in  genetic  equilibrium  it  follows  that  a  reduction 
of  fitness  must  in  principle  result  from  any  change  in  the  array  of  gene 
frequencies,  apart  from  any  genes  that  may  have  no  effect  on  fitness. 
Natural  selection  must  therefore  be  expected  to  resist  any  tendency 
to  change  of  the  gene  frequencies,  such  as  must  result  from  artificial 
selection  applied  to  any  metric  character  other  than  fitness  itself. 
This  principle  has  been  called  "genetic  homeostasis,"  and  its  conse- 
quences have  been  discussed,  by  Lerner  (1954).  Thus  if  we  change 
any  metric  character  by  artificial  selection  we  must  expect  a  reduction 
of  fitness  as  a  correlated  response.  And  if  we  then  suspend  the  arti- 
ficial selection  before  any  of  the  variation  has  been  lost  by  fixation, 
we  must  expect  the  population  mean  to  revert  to  its  original  value. 
The experience of artificial selection is, on the whole, in
agreement with this expectation, though under laboratory conditions
the  reduction  of  fitness  may  not  be  apparent  in  the  early  stages,  and 
some  characters  appear  to  revert  very  slowly,  if  at  all,  toward  the 
original  value.  Our  domesticated  animals  and  plants  are  perhaps  the 
best  demonstration  of  the  effects  of  the  principle.  The  improvements 
that  have  been  made  by  selection  in  these  have  clearly  been  accom- 
panied by  a  reduction  of  fitness  for  life  under  natural  conditions,  and 
only  the  fact  that  domestic  animals  and  plants  do  not  have  to  live 
under  natural  conditions  has  allowed  these  improvements  to  be 
made.  The  problems  for  discussion  in  this  chapter  must  be  seen 
against  the  background  of  this  principle:  that  the  existing  array  of 
gene  frequencies,  and  consequently  the  existing  genetic  properties  of 
the population, represent the best total adjustment to existing
conditions that is possible with the available genetic variation.

The  problem  of  how  natural  selection  operates  on  metric  charac- 
ters has  two  aspects:  the  relation  between  any  particular  metric 
character  and  fitness,  and  the  way  in  which  natural  selection  operates 
on  the  individual  loci  concerned  with  a  metric  character.  This  latter 
aspect  is  part  of  a  wider  problem  which  concerns  the  reasons  for  the 
existence  of  genetic  variation.  We  shall  discuss  these  two  aspects 
separately,  because  any  conclusions  that  may  be  drawn  about  the 
second  will  depend  on  what  can  be  discovered  about  the  first. 


Relation  of  Metric  Characters  to  Fitness 

The  fitness  of  an  individual  is  the  final  outcome  of  all  its  develop- 
mental and  physiological  processes.  The  differences  between  indi- 
viduals in  these  processes  are  seen  in  variation  of  the  measurable 
attributes  which  can  be  studied  as  metric  characters.  Thus  the 
variation  of  each  metric  character  reflects  to  a  greater  or  lesser  degree 
the  variation  of  fitness;  and  the  variation  of  fitness  can  theoretically 
be  broken  down  into  variation  of  metric  characters.  Let  us  consider 
for  example  a  mammal  such  as  the  mouse,  because  this  matter  is  more 
easily  discussed  in  concrete  terms.  Fitness  itself  might  be  broken 
down  into  two  or  three  major  components,  which  could  be  measured 
and  studied  as  metric  characters.  These  might  be  the  total  number 
of  young  reared,  and  some  measure  of  the  quality  of  the  young,  such 
as  their  weaning  weight.  The  variation  of  the  major  components 
would  account  for  all  the  variation  of  fitness.  Each  of  the  major  com- 
ponents might  be  broken  down  into  other  metric  characters  which 
would  account  for  all  their  variation.  Thus  the  total  number  of 
young  weaned  depends  on  the  viability  of  the  parent  up  to  breeding 
age,  its  mating  ability,  average  litter  size,  frequency  of  litters,  and 
longevity.  These  characters  in  turn  might  be  further  broken  down. 
For  example,  litter  size  depends  on  the  number  of  eggs  shed  and  the 
proportion  that  are  brought  to  term.  The  number  of  eggs  shed 
depends,  again,  on  body  size  and  endocrine  activity,  among  other 
things.  Thus  each  metric  character  has  its  place  in  one  of  a  series  of 
chains  of  causation  converging  toward  fitness.  And  these  chains  of 
causation  interconnect  one  with  another:  body  size,  for  example, 
influences not only litter size, but also lactation, longevity, and
probably many other characters. The relationship between any particular
metric  character  and  fitness  is  thus  a  very  complicated  matter.  The 
following  discussion  of  the  problem  is  based  largely  on  the  ideas  put 
forward by A. Robertson (1955b).

The  way  in  which  natural  selection  operates  on  a  character 
depends  on  the  part  played  by  the  character  in  the  causation  of  differ- 
ences of  fitness:  that  is  to  say,  on  the  manner  and  degree  by  which 
differences  of  value  of  the  metric  character  cause  differences  of  fitness. 
This  we  shall  refer  to  as  the  "functional  relationship"  between  the 
character  and  fitness.  The  functional  relationship  expresses  the 
mode  of  operation  of  natural  selection  on  the  metric  character;  but  it 
is  not  necessarily  also  the  relationship  that  would  be  revealed  if  we 
could  measure  the  fitness  of  individuals  and  compare  their  fitness  with 
their  values  for  the  metric  character.  This  point,  however,  will  be 
more  easily  explained  by  an  example  to  be  given  in  a  moment. 
Different  characters  must  be  expected  to  have  different  functional 
relationships  with  fitness,  according  to  the  nature  of  the  character. 
In  explanation  of  the  kinds  of  relationship  that  may  be  envisaged  let 
us  take  some  examples  of  different  sorts  of  character  at  different 
positions  in  the  chain  of  causation. 

1.  Neutral  characters.  There  may  be  some  characters  that  have 
no  functional  relationship  at  all  with  fitness.  This  does  not  mean 
that,  like  vestigial  organs,  they  have  no  function  or  use.  It  means  that 
the  variation  in  the  character  is  not  a  cause  of  variation  of  fitness. 
Abdominal  bristle  number  in  Drosophila  may  be  taken  as  an  example 
of  a  character  which  is  probably  not  far  from  this  state,  and  two 
reasons  can  be  given  for  regarding  it  thus.  First,  it  is  difficult  to 
conceive  of  any  biological  reason  why  it  should  be  important  to  have 
18  bristles,  or  thereabouts,  on  each  segment  rather  than  more  or 
fewer.  And  second,  if  we  change  the  bristle  number  by  artificial 
selection  and  then  suspend  the  selection,  the  mean  bristle  number 
does  not  return  to  its  original  value — or  returns  only  very  slowly — 
under  the  influence  of  natural  selection,  even  though  it  could  be 
brought  back  rapidly  by  artificial  selection  (Clayton,  Morris,  and 
Robertson,  1957).  In  other  words,  genetic  homeostasis  in  respect  of 
bristle  number  is  weak  or  non-existent.  Such  a  metric  character  may 
be  termed  "neutral"  with  respect  to  fitness.  The  mean  value  of  a 
neutral  character  in  the  population  has  little  or  nothing  to  do  with  the 
character  itself,  but  is  the  outcome  of  the  pleiotropic  effects  of  the 
genes whose frequencies are controlled by their effects on other
characters.  Though  a  neutral  character  has  no  functional  relationship 
with  fitness,  we  may  nevertheless  find  that  individuals  with  different 
values  do  in  fact  differ  in  fitness  in  a  regular  way.  If  the  genetic 
variance  of  the  character  is  predominantly  additive  then  individuals 
with  intermediate  values  will  tend  on  the  whole  to  be  heterozygous  at 
more  loci  than  individuals  with  extreme  values.  Then  if  hetero- 
zygotes  were  superior  in  fitness  for  some  other  reason,  unconnected 
with  the  character  in  question,  this  would  result  in  intermediates 
being  superior  in  fitness.  At  the  level  of  observation  there  would  be 
a  relationship  between  values  of  the  character  and  fitness,  but  this 
would  not  be  a  functional  relationship  because  the  values  of  the 
character  are  not  the  cause  of  the  differences  of  fitness.  The  differ- 
ences of  fitness  are  the  result  of  the  functional  relationships  of  other 
characters  affected  by  the  pleiotropic  action  of  the  genes. 

2.  Characters  with  intermediate  optima.  There  are  some 
characters  for  which  an  intermediate  value  is  optimal  for  functional 
reasons.  One  might  distinguish  three  sorts  of  intermediate  opti- 
mum according  to  the  reasons  for  intermediates  being  superior  in 
fitness. 

(i)  Optima  determined  by  the  character  itself.  As  an  example  we 
might  take  any  character  that  measures  the  thermal  insulation  of  a 
mammalian  coat.  Too  dense  a  coat  would  be  disadvantageous  and  so 
would  too  sparse  a  coat.  An  intermediate  density  would  confer  the 
highest  fitness  as  a  consequence  of  the  function  of  the  coat  in  thermo- 
regulation. For  such  a  character  the  mean  value  in  the  population  is 
the  optimal  value,  provided  there  are  no  complications  of  the  sort  to 
be  considered  later.  Though  irrefutable  biological  reasons  might  be 
given  for  supposing  that  a  character  such  as  the  density  of  fur  has  an 
intermediate  optimal  value,  we  might  nevertheless  find  that  over  the 
range  of  variation  covered  by  the  population  there  was  very  little 
variation  in  fitness.  In  practice  therefore  one  could  not  expect  always 
to  draw  a  clear  line  between  this  sort  of  character  and  a  neutral 
character  such  as  we  have  taken  bristle  number  to  be. 

(ii)  Optima  imposed  by  the  environment.  As  an  example  we  may 
take  the  clutch  size  of  birds.  It  has  been  shown,  particularly  for  the 
European  robin  and  swift,  that  a  larger  number  of  young  are  reared 
from  nests  containing  the  average  number  of  eggs  than  from  nests 
with  larger  or  smaller  clutches  (Lack,  1954).  Thus  individuals  with 
intermediate  values  appear  to  be  the  fittest.  If  a  character  such  as 
this  has  an  optimal  value  that  is  intermediate  there  must  obviously 
be some other factor interacting with it to determine fitness; for,
otherwise,  the  individuals  that  lay  more  eggs  must  inevitably  be  the 
fitter.  The  other  factor  in  this  case  is  the  supply  of  insects  for  feeding 
the  young  and  the  length  of  daylight  available  for  their  capture. 
With  characters  of  this  sort  natural  selection  tends  to  eliminate  indi- 
viduals with  extreme  values  and  favours  individuals  with  intermediate 
values.  The  mean  value  in  the  population  is  the  optimal  value  under 
the  environmental  conditions  to  which  the  population  is  subjected. 
If  the  environment  were  to  change,  the  population  mean  would 
change  too  in  adjustment  to  the  new  optimum.  In  the  case  of  clutch 
size  it  is  noteworthy  that  the  mean  value  varies  with  the  latitude, 
being  larger  in  the  north  than  in  the  south. 

(iii)  Optima  imposed  by  a  correlated  character.  Body  size  in  mice 
may  be  taken  as  an  example.  Larger  mice  have  larger  litters  and, 
under  laboratory  conditions,  they  rear  more  young.  Therefore  if 
there  were  no  other  factor  involved,  larger  mice  would  be  fitter.  Since 
body  size  can,  as  we  have  seen,  be  readily  increased  by  artificial 
selection,  there  must  be  some  other  factor  that  prevents  its  being 
increased  by  natural  selection  in  the  wild.  The  other  factor  in  this  case 
is  probably  not  environmental,  but  another  character  negatively 
correlated  with  size,  namely  wildness.  A  change  of  body  size  under 
artificial  selection  is  always  accompanied  by  a  correlated  change  of 
wildness.  Large  mice  are  phlegmatic  and  unreactive  to  disturbance, 
whereas  small  mice  are  alert  and  react  energetically  to  disturbance 
(MacArthur,  1949;  Falconer,  1953).  Therefore  under  natural  con- 
ditions larger  mice  would  more  readily  fall  prey  to  cats  and  owls  than 
small  mice,  and  the  advantage  of  greater  fertility  would  be  offset  by 
the  disadvantage  of  being  less  well  fitted  to  escape  predators.  The 
body  size  of  wild  mice,  it  may  be  suggested,  represents  the  best 
compromise  between  these  two  correlated  characters.  If  we  could 
measure  the  relationship  between  size  and  fitness  in  wild  mice  we 
should  find  that  those  of  intermediate  size  were  fittest.  With  charac- 
ters of  this  sort  also,  the  population  mean  represents  the  optimal 
value.  But  this  value  is  optimal  not  because  of  this  character  itself  but 
because  of  its  genetic  correlations  with  other  characters.  Large  mice 
are  selected  against  not  because  they  are  large  but  because,  being 
large,  they  are  inevitably  also  less  wild.  This  example  brings  us  to  the 
point  mentioned  at  the  end  of  the  last  chapter:  that  we  must  expect  to 
find  negative  genetic  correlations  between  characters  under  simul- 
taneous selection.   In  this  case  we  find  a  negative  genetic  correlation 
between large size and wildness, both of which may reasonably be
supposed to be favoured by natural selection. These two characters
are "components" of fitness in the same way that characters of
economic importance are components of total economic merit. What
natural  selection  "aims  at"  is  to  increase  both  characters  indefinitely, 
but  the  physiological  connexions  between  them,  which  we  see  as  a 
negative  genetic  correlation,  limit  the  increase  that  is  possible  with 
the  existing  genetic  variability. 

3.  Major  components  of  fitness.  If  we  could  measure  fitness 
itself — which  is  technically  very  difficult — we  should  obviously  find 
no  "optimal"  value;  the  individuals  most  favoured  by  natural 
selection  would  not  be  those  nearest  to  the  population  mean,  but  the 
most  extreme.  In  spite  of  the  selection  toward  higher  values  the 
mean  fails  to  change  under  natural  selection  because  there  is  no 
additive  variance  of  fitness.  If  we  measure  as  a  metric  character 
something  that  is  a  major  component  of  fitness,  in  the  sense  that  it 
accounts  for  a  large  part  of  the  variation  of  fitness,  we  should  probably 
find  the  same  sort  of  relationship.  Fitness  would  increase  as  the 
value  of  the  character  increased.  At  the  very  highest  values,  however, 
fitness  would  probably  decline  again  slightly.  Egg-laying  in  Droso- 
phila  might  well  be  such  a  character,  even  if  measured  only  over  a 
few  days,  since  the  daily  egg  production  is  highly  correlated  with  the 
total  production  (Gowen,  1952).  We  should  almost  certainly  find 
that  the  fittest  individuals  were  not  those  that  laid  an  intermediate 
number  of  eggs,  but  those  that  laid  almost  the  most.  The  most  ex- 
treme individuals  would  probably  be  slightly  less  fit  because  of  some 
environmental  limitation  or  some  correlated  character,  perhaps 
longevity.  There  must  be  many  characters  whose  relationships  with 
fitness  fall  between  this  and  the  previous  type,  characters  with  an 
optimal  value  above  the  population  mean  but  yet  below  that  of  the 
most  extreme  individuals. 
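
The point made under the first type of character, that individuals
with intermediate values tend to be heterozygous at more loci than
individuals with extreme values, can be checked by direct enumeration.
The sketch below is an illustration under simplifying assumptions that
are ours and not the text's: a small number of unlinked loci with equal
and strictly additive effects, both alleles at a frequency of 0.5 at
every locus, and no environmental variation.

```python
from itertools import product


def mean_heterozygosity_by_value(n_loci):
    """Mean number of heterozygous loci for each value of an additive character.

    Each locus scores 0, 1 or 2 (the number of "plus" alleles), with
    Hardy-Weinberg frequencies 1/4, 1/2, 1/4. The character value is the
    sum of the scores over all loci.
    """
    freq = {0: 0.25, 1: 0.5, 2: 0.25}
    weight = {}    # total probability of each character value
    het_sum = {}   # probability-weighted number of heterozygous loci
    for genotype in product((0, 1, 2), repeat=n_loci):
        prob = 1.0
        for score in genotype:
            prob *= freq[score]
        value = sum(genotype)
        hets = genotype.count(1)   # a score of 1 marks a heterozygous locus
        weight[value] = weight.get(value, 0.0) + prob
        het_sum[value] = het_sum.get(value, 0.0) + prob * hets
    return {v: het_sum[v] / weight[v] for v in sorted(weight)}


if __name__ == "__main__":
    for value, hets in mean_heterozygosity_by_value(4).items():
        print(value, round(hets, 3))
```

With four loci the extreme values, 0 and 8, can be reached only by
complete homozygotes, whereas individuals with the central value of 4
are heterozygous at 16/7, or about 2.3, loci on average. Any advantage
of heterozygotes arising through other characters would therefore
appear, at the level of observation, as an advantage of intermediates.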

The  foregoing  discussion  will  be  enough  to  explain  the  nature  of 
the  problem  of  the  relationship  between  a  metric  character  and  fitness 
and  to  indicate  the  sort  of  solution  that  may  be  sought.  Let  us  turn 
now  to  the  connexion  between  the  relationship  with  fitness  and  the 
nature  of  the  genetic  variation  of  a  metric  character.  When  we  first 
discussed  the  heritability  as  a  property  of  a  character  in  Chapter  10, 
we  noted  a  tendency  toward  lower  heritabilities  among  characters 
more  closely  connected  with  fitness.  But  the  precise  meaning  of  a 
"close  connexion"  with  fitness  was  not  explained.    It  may  now  be 
suggested that the meaning of a close connexion with fitness may
perhaps  be  seen  in  the  functional  relationships  discussed  above. 
Characters  with  the  closest  connexion  are  of  the  third  type  where  the 
population  mean  is  not  at  an  optimal  value;  characters  with  a  less 
close  connexion  are  nearer  to  the  second  type;  while  characters  with 
the  least  connexion  are  the  neutral  or  nearly  neutral  characters.   On 
the  whole  it  does  seem  that  characters  with  high  heritabilities  are  to 
be  found  among  the  first  type  and  characters  with  low  heritabilities 
among the third. Differences of heritability are, however, not really
relevant here. It is the genetic variance with which we are concerned,
and it is the differences in the proportion of the genotypic variance
that is additive that we want to account for. But so little is known about
how  the  genotypic  variance  is  partitioned  into  additive  and  non- 
additive  components  that  we  can  scarcely  begin  to  tackle  the  prob- 
lem.  Four  characters  of  Drosophila,  however,  seem  to  fit  the  picture 
fairly well (see Table 8.2). For bristle number, which we have taken
as  a  neutral  character  of  the  first  type,  85  per  cent  of  the  genotypic 
variance  is  additive.   Thorax  length,  which  might  perhaps  be  of  the 
second  type,  has  about  the  same  proportion.  For  ovary  size,  however, 
only  43  per  cent  of  the  genotypic  variance  is  additive,  and  this 
character  might  well  be  between  the  second  and  third  types.    For 
egg  laying,  which  we  have  taken  to  be  of  the  third  type,  the  propor- 
tion is  29  per  cent.    These  comparisons,  of  course,  cannot  be  given 
much  weight  because  in  fact  we  know  almost  nothing  of  the  func- 
tional relations  of  the  characters  with  fitness.    But  they  do  suggest 
that  the  solution  of  the  problem  of  why  characters  differ  in  their 
genetic  properties  may  lie  along  these  lines.  The  reaction  of  a  charac- 
ter to  inbreeding  seems  also  to  be  connected  with  the  proportion  of 
non-additive  genetic  variance,  those  with  most  non-additive  variance 
being  those  that  suffer  the  greatest  inbreeding  depression.    Some, 
perhaps  most,  of  the  non-additive  variance  must  be  attributed  to 
dominance.   Reasons  for  expecting  the  effects  of  genes  on  characters 
closely  connected  with  fitness  to  show  dominance,  while  the  effects  on 
characters  not  closely  connected  with  fitness  do  not,  have  been  put 
forward by A. Robertson (1955b); but it would take too much space
here  to  summarise  the  argument.   There  we  must  leave  the  problem 
of  the  nature  of  the  genetic  variance  and  pass  on  to  the  second  aspect 
of  the  operation  of  natural  selection  on  metric  characters. 


Maintenance of Genetic Variation

The  second  aspect  of  the  operation  of  natural  selection  on  metric 
characters — its  effects  on  the  individual  loci — is  part  of  a  wider 
problem,  which  concerns  the  mechanisms  by  which  genetic  variation 
is  maintained.  Almost  every  metric  character,  of  the  many  that  have 
been  studied  both  in  natural  populations  and  in  domesticated  animals 
and  plants,  exhibits  genetic  variation.  What  are  the  reasons  for  the 
existence  of  this  genetic  variation?  The  coexistence  in  a  population 
of  different  alleles  at  a  locus  is  governed  by  the  three  processes  of 
mutation,  random  drift,  and  selection.  Allelic  differences  originate 
by  mutation  and  are  extinguished  by  random  drift,  since  no  natural 
population  is  infinite  in  size.  Natural  selection  may  tend  to  eliminate 
the  differences  by  favouring  one  allele  over  all  others  at  a  locus;  or  it 
may  tend  to  perpetuate  the  differences  by  favouring  heterozygotes. 
Let  us  discuss  the  role  of  natural  selection  first  and  the  roles  of  muta- 
tion and  random  drift  later. 

Effects  of  selection  on  individual  loci.    The  way  in  which 
selection  operates  on  any  locus  depends  on  the  effects  that  the  differ- 
ent alleles  have  on  fitness  itself,  and  not  simply  on  their  effects  on  one 
particular metric character. Therefore the functional relations
between characters and fitness, which were discussed above, can
indicate the action of selection only on those loci which affect fitness
through  the  character  in  question  and  not  through  any  pleiotropic 
effects  on  other  characters.    Let  us  consider  the  three  types  of 
character  in  turn. 

1. Neutral characters. If there are genes whose only effects are
on  a  neutral  character,  then  selection  plays  no  part  in  the  existence  of 
allelic  differences  at  these  loci.  The  gene  frequencies  at  these  loci 
must  be  controlled  solely  by  mutation  and  random  drift. 

2.  Characters  with  intermediate  optima.  The  consequences  of 
selection  favouring  individuals  of  intermediate  value  have  been 
examined from different aspects by Wright (1935a, b), Haldane
(1954a), and by A. Robertson (1956) who reaches the following
conclusions. If the intermediate optimum is the result of the
functional relations of the character to fitness, and the optimum is
determined by the character itself or by the environment, then selection
will  tend  toward  fixation  at  all  the  loci  whose  only  influence  on  fitness 
is  through  the  character  in  question.  This  would  apply  to  characters 
of type 2 (i) and (ii) described above and exemplified by the density
of  mammalian  fur  and  by  clutch  size  in  birds.  Selection  will  thus 
tend  to  eliminate  rather  than  to  conserve  variability  arising  from  loci 
which  affect  fitness  only  through  such  characters.  The  rate  at  which 
the  gene  frequencies  are  expected  to  change  toward  fixation  is  very 
slow,  and  so  the  rate  at  which  variation  would  be  eliminated  is  also 
very  slow;  but  on  an  evolutionary  time-scale  it  would  not  be  negligible. 
Characters  of  type  2  (iii),  where  an  intermediate  optimum  is  deter- 
mined by  a  correlated  character,  have  not  yet  been  investigated  in 
this  connexion,  and  the  mode  of  operation  of  selection  on  loci  that 
affect  them  is  not  known. 

3.  Major  components  of  fitness.  The  essential  feature  of  a  major 
component  of  fitness  is  that  the  population  mean  is  not  at  the  opti- 
mum. But  we  cannot  deduce,  from  this  fact  alone,  how  selection 
operates  on  the  individual  loci.  If  the  genes  that  affect  these  charac- 
ters are  at  intermediate  frequencies,  it  seems  most  probable  that  they 
are  held  there  by  selection  favouring  heterozygotes,  because  it  seems 
hardly  possible  that  the  coefficients  of  selection  are  small  enough  to 
allow  mutation  alone  to  maintain  intermediate  frequencies.  We  do  not 
know,  however,  whether  these  genes  are  at  intermediate  frequencies. 
It  seems  quite  possible  that  a  considerable  portion  of  the  genetic 
variation  of  these  characters  is  due  to  genes  at  very  low  frequencies, 
where  they  are  maintained  by  the  balance  between  mutation  and 
selection  against  the  recessive  homozygotes.  Much  evidence,  how- 
ever, has  been  presented  by  Lerner  (1954)  in  support  of  the  view  that 
heterozygotes in general are superior in fitness; and Haldane (1954b)
has  pointed  out  that  a  general  superiority  of  heterozygotes  is  a  very 
reasonable  expectation  from  biochemical  considerations  of  gene 
action.  Though  the  matter  is  not  yet  settled,  the  weight  of  evidence 
at  present  seems  to  point  to  superior  fitness  of  heterozygotes,  and 
consequently  to  natural  selection  favouring  heterozygotes  at  most  of 
the  loci  that  affect  fitness  through  its  major  components. 
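
The conclusion stated above for characters with intermediate optima,
that selection tends toward fixation, can be illustrated for a single
locus. The following is a simplified sketch and not a reproduction of
the analyses cited. It assumes, for illustration only, that fitness
falls off as the square of the deviation from the optimum, that the
locus acts additively, that the rest of the genotype and the
environment contribute an independent variance, and that the
population mean is held exactly at the optimum whatever the gene
frequency. The function name and the parameter values are hypothetical.

```python
def next_q_stabilizing(q, a=1.0, s=0.1, v_background=1.0):
    """One generation of selection on an additive locus under an intermediate optimum.

    Genotypes A1A1, A1A2, A2A2 add 0, a, 2a to the character, and q is the
    frequency of A2. Fitness is taken as 1 - s * (deviation from optimum)**2,
    and the population mean is assumed to sit exactly at the optimum.
    """
    p = 1.0 - q
    mean_contribution = 2.0 * q * a   # mean contribution of this locus
    fitness = []
    for value in (0.0, a, 2.0 * a):
        deviation = value - mean_contribution
        # Expected squared deviation from the optimum for this genotype:
        # its own squared deviation plus the background variance.
        fitness.append(1.0 - s * (deviation ** 2 + v_background))
    w11, w12, w22 = fitness
    mean_w = p * p * w11 + 2.0 * p * q * w12 + q * q * w22
    return (p * q * w12 + q * q * w22) / mean_w


if __name__ == "__main__":
    for start in (0.4, 0.5, 0.6):
        print(start, next_q_stabilizing(start))
```

Under these assumptions a gene frequency of 0.5 is left unchanged, but
it is an unstable point: starting below 0.5 the frequency falls, and
starting above 0.5 it rises, so that the rarer allele is slowly lost.
The change in one generation is small, which agrees with the remark
that variation would be eliminated only very slowly.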

There  are  three  other  ways  in  which  selection  may  influence 
genetic  variability,  to  be  discussed  before  we  leave  the  subject.  They 
are  all  subsidiary  to  the  main  effects  on  gene  frequencies  which  we 
have  been  discussing;  they  may  modify  these  main  effects,  but  they 
do  not  in  themselves  provide  a  sufficient  description  of  the  operation 
of  natural  selection. 

Variable  selection.  If  characters  have  optimal  values  these 
optima are likely to vary from time to time and from season to season
according  to  the  environmental  conditions.  The  selection  pressures 
on  the  individual  genes  are  therefore  likely  to  change  from  generation 
to  generation.  The  consequence  of  variable  selection  coefficients  has 
been  shown  (Kimura,  1954)  to  be  a  tendency  toward  fixation — or 
more  strictly,  near-fixation — the  favoured  allele  being  the  one  that 
gives  the  highest  average  fitness.  In  this  aspect  selection  would 
therefore  tend  to  eliminate  variability.  The  optimal  values  are 
likely  to  vary  also  from  place  to  place  within  each  generation,  especi- 
ally if  different  genotypes  choose  different  environments  in  which  to 
live,  as  Waddington  (1957)  suggests.  This  form  of  variable  selection 
has  been  shown  to  be  capable  under  certain  conditions  of  maintaining 
stable  polymorphism,  as  was  mentioned  in  Chapter  2.  Its  effect  on 
the  variation  of  metric  characters,  however,  has  not  been  examined. 
It  does  not  seem  likely  to  be  very  great. 

Balanced  linkage.  Mather's  theory  of  "polygenic  balance"  is 
based  on  the  idea  of  selection  favouring  intermediate  values  of  metric 
characters  and  the  effect  this  is  likely  to  have  on  linkage  (see  for 
example, Mather, 1949, 1953b). In considering linkage between the
loci  affecting  a  metric  character  we  have  to  take  account  of  the  linkage 
phase.  We  may  say  that  two  genes  on  the  same  chromosome  are  in 
coupling  if  they  affect  the  character  in  the  same  direction,  and  in 
repulsion  if  they  affect  it  in  opposite  directions.  The  two  phases  will 
be  represented  in  equal  frequencies  in  a  random-breeding  population 
subject  to  no  selection,  as  was  shown  in  Chapter  1.  Now,  chromo- 
somes carrying  genes  in  coupling  will  contribute  more  to  the  variation 
than  chromosomes  carrying  genes  in  repulsion.  And  individuals  with 
intermediate  values  will  tend  on  the  whole  to  carry  repulsion  chromo- 
somes rather  than  coupling  chromosomes.  Therefore,  if  intermediates 
are  favoured  for  functional  reasons,  selection  will  favour  repulsion 
chromosomes and thus tend to build up "balanced" combinations of
genes:  that  is,  combinations  in  predominantly  repulsion  linkage, 
which  contribute  the  minimal  amount  of  variance.  In  this  way, 
according  to  Mather,  "potential"  genetic  variability  is  stored  in  latent 
form,  and  a  compromise  is  reached  between  the  conflicting  needs  of 
uniformity  in  adaptation  to  present  circumstances  and  flexibility  in 
adaptation  to  changing  circumstances. 
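
The effect of linkage phase on the variance can be made concrete with
two loci. The sketch below is an illustration under assumptions of our
own choosing: two loci with equal additive effects, both alleles at a
frequency of 0.5, random mating, and a stated proportion of chromosomes
in the coupling phase. It is not Mather's model, only the arithmetic
behind the statement that coupling chromosomes contribute more to the
variation than repulsion chromosomes.

```python
def genotypic_variance(coupling_fraction, effect=1.0):
    """Genotypic variance arising from two linked loci with equal additive effects.

    coupling_fraction is the proportion of chromosomes in coupling (++ or --,
    in equal numbers); the remainder are in repulsion (+- or -+).
    """
    c = coupling_fraction
    # (value of chromosome, frequency); value = effect * number of plus alleles.
    chromosomes = [
        (2.0 * effect, c / 2.0),   # ++
        (0.0, c / 2.0),            # --
        (1.0 * effect, 1.0 - c),   # +- and -+ taken together
    ]
    mean = sum(value * freq for value, freq in chromosomes)
    var_chromosome = sum(freq * (value - mean) ** 2 for value, freq in chromosomes)
    # Under random mating an individual is a pair of independent chromosomes.
    return 2.0 * var_chromosome


if __name__ == "__main__":
    for c in (0.0, 0.2, 0.5, 0.8):
        print(c, genotypic_variance(c))
```

With the two phases equally frequent (a coupling fraction of 0.5) the
variance is that expected from independent loci. An excess of repulsion
chromosomes reduces it, to nothing in the extreme case where all
chromosomes are in repulsion, and an excess of coupling chromosomes
increases it.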

If,  however,  this  supposed  tendency  of  selection  to  build  up 
balanced  combinations  is  to  have  any  significant  effect  on  genetic 
variability  it  is  necessary  that  the  selection  should  be  strong  enough 
to  maintain  the  balanced  combinations  in  the  face  of  recombination 
which must tend continuously to reduce them to a random
arrangement. The selective forces required have been examined by Wright
(1952b). It is clear, without going into the details, that coefficients of
selection  of  the  same  order  of  magnitude  as  the  recombination  fre- 
quencies would  be  required.  The  balancing  of  linkage  by  natural 
selection  therefore  seems  from  Wright's  reasoning  to  be  relevant  only 
to  very  short  segments  of  chromosome.  Loci  with  more  than  about  1 
per  cent  recombination  between  them  would  not  be  expected  to 
depart  significantly  from  a  random  arrangement,  unless  they  carried 
major  genes  with  large  effects  on  the  character.  Furthermore,  if  we 
consider  a  number  of  loci  on  the  same  chromosome,  it  is  not  clear 
how  much  difference  of  variance  would  be  expected  between  fully 
balanced  and  fully  random  arrangements;  it  might  well  be  very  little. 
Experimental  evidence  on  the  matter  is  scanty.  In  two  experiments, 
one  with  mice  and  the  other  with  Drosophila,  where  artificial  selection 
was  applied  for  and  against  intermediates,  no  changes  of  variance 
were detected (Falconer and Robertson, 1956; Falconer, 1957b).
Intensification  of  the  selection  against  extremes  therefore  does  not 
seem  to  have  any  effect  on  the  variance  within  the  time-span  of  a 
laboratory  experiment. 
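
The strength of recombination as an opposing force can be illustrated with the standard result for a random-breeding population under no selection: the departure from a random arrangement of two loci is multiplied by (1 - c) in each generation, where c is the recombination frequency. The sketch below uses that result only; the function names are hypothetical.

```python
import math


def remaining_association(c, generations):
    """Fraction of the initial non-random association left after a number
    of generations of random mating, with recombination frequency c."""
    if not 0.0 < c <= 0.5:
        raise ValueError("recombination frequency must be in (0, 0.5]")
    return (1.0 - c) ** generations


def generations_to_halve(c):
    """Generations needed for the association to fall to half its value."""
    if not 0.0 < c <= 0.5:
        raise ValueError("recombination frequency must be in (0, 0.5]")
    return math.log(0.5) / math.log(1.0 - c)


unlinked = generations_to_halve(0.5)        # independent loci
tightly_linked = generations_to_halve(0.01)  # 1 per cent recombination
```

With 1 per cent recombination the association is halved in roughly 69 generations if selection does nothing to restore it, whereas with free recombination it is halved every generation. This gives some feeling for why selection coefficients of about the same size as c are needed to hold balanced combinations together.
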

Canalisation.  Waddington's  theory  of  "canalisation"  is  con- 
cerned with  the  developmental  pathways  through  which  the  pheno- 
typic  values  come  to  their  expression  (see  Waddington,  1957).  If 
intermediates  are  favoured  because  of  their  values  of  the  metric 
character  in  question,  then  deviation  from  the  optimal  value  is  dis- 
advantageous. Selection  will  therefore  operate  against  the  causes  of 
deviation,  and  will  tend  to  produce  a  greater  stability  so  that  develop- 
ment is  canalised  along  the  path  that  leads  to  the  optimal  phenotypic 
expression.  The  role  ascribed  to  selection  is  its  discrimination  against 
alleles  that  increase  variability.  These  may  be  at  loci  that  affect  the 
character  in  question  or  at  other  loci.  Variation  both  of  environ- 
mental and  of  genetic  origin  may  be  reduced  in  this  way.  The 
genetic  variation  is  reduced  not  by  eliminating  the  segregation,  but  by 
rendering  the  organism  less  sensitive  to  the  effects  of  the  segregation. 
A  change  in  the  proportion  of  genetic  to  environmental  variation  is 
therefore  not  necessarily  to  be  expected.  As  a  consequence  of  canalisa- 
tion we  should  expect  to  find  some  characters  less  variable  than 
others,  the  less  variable  being  those  for  which  deviation  from  the 
optimum  has  the  more  serious  effect  on  fitness.  This  expected  con- 
sequence of canalisation, however, cannot easily be tested experimentally,
because, as Waddington (1957) points out, it is difficult to
find  a  logical  basis  for  comparing  the  variability  of  different  characters. 

Origin of variation by mutation. Before the reasons for the
existence  of  genetic  variability  can  be  fully  understood  it  will  be 
necessary  to  know  what  part  mutation  plays  in  restoring  what  is  lost 
by  random  drift  or  by  selection.  If  there  were  no  selection  of  any 
kind  then  the  amount  of  genetic  variation  would  come  to  equilibrium 
when  its  rate  of  origin  was  equal  to  its  rate  of  extinction  by  random 
drift.  The  rate  of  extinction  presents  no  very  serious  problem  be- 
cause we  need  know  the  population  size  only  approximately.  If, 
therefore,  we  knew  the  rate  of  origin  by  mutation  we  could  decide 
whether  a  significant  amount  of  the  existing  variation  can  be  ascribed 
to  mutation.  Very  little,  however,  is  known  about  the  rate  of  origin 
by mutation. The only evidence comes from two studies of Drosophila
by  Clayton  and  Robertson  (1955)  and  Paxman  (1957),  which  yielded 
very  similar  results.  The  following  discussion  is  based  on  the  experi- 
ment of  Clayton  and  Robertson.  Selection  for  abdominal  bristle 
number  was  applied  to  an  inbred  line  derived  from  the  same  base 
population  on  which  the  other  studies  of  this  character  were  made. 
From  the  rate  of  response  to  selection  it  was  concluded  that  the  aver- 
age amount  of  variation  arising  by  spontaneous  mutation  in  one 
generation  amounted  to  one  thousandth  part  of  the  genetic  variation 
present  in  the  base  population.  In  other  words  it  would  take  about 
1000  generations  for  mutation  to  restore  the  genetic  variation  to  its 
original  level.  (We  may  note  in  passing  that  this  proves  mutation  to 
have  a  negligible  influence  on  the  response  of  non-inbred  populations 
to  artificial  selection,  apart  from  the  rare  occurrence  of  mutants  with 
major  effects.)  Now  consider  the  loss  of  variance  due  to  random  drift 
in  a  population  of  effective  size  Ne,  subject  to  no  selection.  If  all  the 
genetic  variance  is  additive,  as  it  very  nearly  is  in  the  case  of  bristle 
number, then the rate of loss per generation is equal to the rate of
inbreeding, which is 1/(2Ne). (This follows from the reasoning given
in Chapter 15, where the variance within a line was shown to be
(1 - F) times the original variance.) Therefore the new variation
arising  by  mutation  at  the  rate  found  in  this  experiment  would  be  lost 
at  the  same  rate,  if  the  rate  of  inbreeding  were  1/1,000:  that  is,  in  a 
population  of  effective  size  500.  The  base  population  was  roughly 
ten  times  this  size  and  therefore  the  expected  rate  of  extinction  by 
random  drift  is  less  than  the  observed  rate  of  origin  by  mutation.  In 
other  words,  mutation  alone  seems  to  be  capable  of  accounting  for 
more variation of bristle number than was actually present in the
base  population.  Therefore  selection  favouring  heterozygotes  does 
not  seem  to  have  been  an  important  cause  of  the  genetic  variability  of 
bristle  number.  This  suggests  that  little  of  the  variation  of  bristle 
number  is  due  to  the  pleiotropic  effects  of  genes  that  affect  the  major 
components  of  fitness.  It  suggests,  in  other  words,  that  much  of  the 
variation  of  bristle  number  is  due  to  genes  that  are  not  far  from  being 
neutral  with  respect  to  fitness.  This  conclusion,  though  only  tenta- 
tive, is  in  line  with  the  fact,  mentioned  earlier,  that  bristle  number 
shows little tendency to revert to the original mean value when
artificial selection is relaxed. The conclusions to which the results of
this experiment point cannot yet be extended to other characters.
Characters more closely connected with fitness, when they have been
studied from this point of view, may present a very different picture.
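
The arithmetic of the balance between mutation and random drift can be set out as a short sketch. It takes from the text only the rate of loss by drift, 1/(2Ne), and the mutational input of one thousandth of the base population's genetic variance per generation. The equilibrium ratio in the last function is an extrapolation under the simplest assumptions (purely additive variance, no selection, constant input), and the names are hypothetical.

```python
def drift_loss_rate(ne):
    """Proportion of additive variance lost per generation by random drift."""
    return 1.0 / (2.0 * ne)


def balancing_effective_size(mutational_fraction):
    """Effective size at which drift removes variance exactly as fast as
    mutation supplies it, the input being a fraction of the base variance."""
    return 1.0 / (2.0 * mutational_fraction)


def equilibrium_variance_ratio(ne, mutational_fraction):
    """Variance maintained by mutation against drift alone, as a multiple of
    the base-population variance (assumption: input constant, no selection).

    At equilibrium: loss = V / (2 Ne) = input = mutational_fraction * V_base.
    """
    return 2.0 * ne * mutational_fraction


mutational_fraction = 1.0 / 1000.0                      # from the experiment
ne_balance = balancing_effective_size(mutational_fraction)
ratio_at_5000 = equilibrium_variance_ratio(5000, mutational_fraction)
```

The balancing effective size comes out at 500, as stated above. For a population roughly ten times that size, this simple model would maintain about ten times the variance of the base population, which is the sense in which mutation alone appears able to account for more variation than was actually present.
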

Evolutionary significance of variability. There can be little
doubt  that  the  existence  of  genetic  variation  is  advantageous  to  the 
evolutionary  survival  of  a  species,  the  advantage  it  confers  being  the 
ability  to  evolve  rapidly  and  so  to  meet  the  needs  of  a  changing 
environment,  both  through  the  course  of  time  and  in  the  colonisation 
of  new  localities.  Sexual  reproduction  and  outbreeding  are  necessary 
conditions  for  the  continued  existence  of  genetic  variation  and  it  is 
noteworthy  that  the  naturally  inbreeding  species  among  the  higher 
plants  are  of  comparatively  recent  origin.  This  suggests  that  the 
possession  of  genetic  variability  is  necessary  for  the  continued  exist- 
ence of  a  species  over  a  long  period  of  time;  or  in  other  words,  that  the 
prevalence  of  genetic  variability  among  existing  species  is  because 
those  without  it  have  not  survived.  The  inbreeding  plants,  however, 
as  we  see  them  at  present,  compete  successfully  with  the  outbreeding 
species,  and  this  proves  that  the  possession  of  genetic  variability  does 
not  confer  much  immediate  advantage.  The  evolutionary  significance 
of  genetic  variability,  however,  throws  no  light  on  the  mechanisms 
that  maintain  it.  It  is  these  mechanisms,  which  have  been  discussed 
in  this  chapter,  that  are  the  concern  of  quantitative  genetics. 


The  Genes  concerned  with  Quantitative   Variation 

The  genetic  variation  of  metric  characters  appears  from  the  re- 
sults of  experimental  selection  to  be  the  product  of  segregation  at 
some  hundreds  of  loci,  or  more  probably  some  thousands  if  the 
variation of all characters is included. So natural populations probably
carry a variety of alleles at a considerable proportion of loci, even
perhaps  at  virtually  every  locus.  It  seems  unreasonable,  therefore,  to 
think  of  genes  having  the  control  of  a  metric  character  as  their 
specific  function:  we  cannot  reasonably  suppose  that  there  are  genes 
whose  only  functions  are  the  adjustment  of,  say,  body  size  to  an 
optimal  value.  How,  then,  are  we  to  think  of  the  genes  with  which  we 
are  concerned  in  quantitative  genetics?  Our  knowledge  of  these 
genes  may  be  briefly  summarised  as  follows. 

The distinction between "major" and "minor" genes marks the
difference  between  those  which  we  can  study  individually,  and  whose 
properties  are  therefore  fairly  easily  discovered,  and  those  which  we 
cannot  study  individually  and  whose  properties  can  only  be  deduced 
by  indirect  means.  Both  are  concerned  with  quantitative  variation. 
Among  the  major  genes  two  sorts  may  be  distinguished.  There  are 
genes  with  more  or  less  severely  deleterious  effects  on  fitness,  and 
these  include  nearly  all  the  "mutants"  of  Mendelian  genetics,  as  well 
as  lethals.  Each  may  have  pleiotropic  effects  on  a  variety  of  metric 
characters.  They  are  recessive,  or  nearly  so,  in  their  effects  on  fitness, 
but  not  necessarily  also  in  their  effects  on  metric  characters.  They  are 
kept  in  equilibrium  at  low  frequencies  by  natural  selection  balanced 
against  mutation.  Being  at  low  frequencies  they  contribute,  individu- 
ally, little  to  the  genetic  variance  of  any  character;  their  total  contri- 
bution, however,  is  unknown.  They  are  probably  an  important  cause 
of  inbreeding  depression.  Major  genes  of  the  second  sort  are  those 
responsible  for  the  antigenic  differences.  The  alleles  at  these  loci  are 
at  intermediate  frequencies  where  they  are  probably  maintained  by 
selection  favouring  heterozygotes.  Their  effects  on  fitness,  however, 
are  probably  fairly  small — certainly  small  enough  for  all  to  be 
regarded  as  "wild-type"  alleles.  Their  effects  on  metric  characters 
are  almost  unexplored,  and  their  importance  as  sources  of  variation  is 
consequently  unknown.  They  presumably  contribute  to  inbreeding 
depression  if  heterozygotes  are  superior  in  fitness,  but  again  their 
relative  importance  in  this  respect  is  not  known  with  certainty. 
About  the  minor  genes  little  is  known.  They  do  not  necessarily 
occupy  loci  different  from  those  occupied  by  major  genes.  It  seems 
more  likely,  on  the  contrary,  that  they  are  isoalleles,  capable  of 
mutating  to  major  deleterious  genes.  They  are  performing  their 
primary  functions  perfectly  adequately  and  may  differ  only  in  the  rate 
at which their primary product is synthesised. The variation of
metric characters which they produce may be quite incidental to their
main  biochemical  functions.  There  is  no  reason  at  present  to  think 
that  these  minor  genes  differ  in  any  essential  way  from  the  genes  that 
determine  antigenic  differences.  The  fact  that  their  effects  are  not 
individually  recognisable,  whereas  the  antigenic  differences  are,  may 
be  due  only  to  the  inadequacy  of  the  techniques  available  for  detect- 
ing biochemical  differences  among  essentially  normal  individuals. 

The  problems  that  have  been  raised  but  left  unanswered  in  this 
chapter  will  be  sufficient  indication  of  the  directions  which  the  future 
development  of  quantitative  genetics  may  take.  It  does  not  seem  to 
the  present  writer  that  much  progress  toward  their  solution  is  likely 
to  be  made  by  deductive  reasoning,  because  most  of  the  outstanding 
problems  are  not  essentially  theoretical  in  nature:  the  theoretical 
structure  of  the  subject  is  now  fairly  clear,  at  least  in  its  main  out- 
lines. Some  of  the  outstanding  problems  are  beyond  the  reach  of  the 
experimental  techniques  now  at  our  command.  New  techniques,  both 
more  penetrating  and  more  discriminating,  will  therefore  be  needed. 
Other  problems  arise  from  the  paucity  of  experimental  data  and  the 
consequent  difficulty  of  deciding  what  phenomena  are  general  and 
what  are  due  to  special  circumstances.  These  problems  will  be  solved 
not  so  much  by  deliberately  designed  experiments,  but  rather  from 
the  accumulated  experience  of  experiments  extended  to  a  wider 
variety  of  characters  and  of  organisms. 


GLOSSARY OF SYMBOLS

This  list  gives  the  meanings  of  most  of  the  symbols  used  in  the  book. 
Many  of  the  symbols  listed  are  used  also  with  other  meanings  in  certain 
places,  but  these  meanings,  as  well  as  the  symbols  not  listed,  do  not  appear 
more  than  a  page  or  two  removed  from  their  definition.  The  more  im- 
portant differences  from  current  usage  are  indicated  where  the  equivalent 
symbols  used  by  Lerner  (1950) — denoted  by  (L) — and  by  Mather  (1949) 
— denoted  by  (M) — are  given. 

A1, A2    Allelomorphic genes.
A         Breeding value. = G (L).
a         Genotypic value of the homozygote A1A1, as deviation from the
          mid-homozygote value. = d (M).
α         Average effect of a gene-substitution.
α1, α2    Average effects of the alleles A1 and A2 respectively.
b         Regression coefficient; e.g. bOP = regression of offspring on parent.
CR        Correlated response to selection.
D         Dominance deviation.
d         Genotypic value of the heterozygote A1A2, as deviation from the
          mid-homozygote value. = h (M).
Δ         Change of -, as Δq = change of gene frequency, ΔF = rate of
          inbreeding.
E         Environmental deviation.
Ec        Common environment; i.e. environmental deviation of family mean
          from population mean. = C (L).
Ew        Within-family environment; i.e. environmental deviation of
          individual from family mean. = E' (L).
F         Coefficient of inbreeding.
F1        First generation of cross between lines or populations.
F2        Second generation of cross, by random mating among F1.
FS        Full sibs.
f         Coancestry; i.e. inbreeding coefficient of the progeny of the
          individuals concerned.
f         (Chap. 13): Subscript referring to selection between families.
G         Genotypic value. = Ge (L).
H         Frequency of heterozygous genotype (A1A2).
H         Amount of heterosis; i.e. deviation of cross mean from mid-parent
          value.
HS        Half sibs.
h²        Heritability.
I         Interaction deviation, due to epistasis.
I         (Chap. 13 & 19): Index for selection.
i         Intensity of selection; i.e. selection differential in units of the
          phenotypic standard deviation. = ī (L).
M         Population mean.
m         Immigration rate.
N         Population size; i.e. number of breeding individuals in a population
          or line.
N         (Chap. 10 & 13): Number of families.
Ne        Effective population size.
n         Number in various contexts. In Chapters 10 and 13, specifically
          number of offspring per family.
O         Offspring.
P         Parent. P̄ = Mid-parent.
P         Frequency of homozygous genotype (A1A1).
P         Panmictic index, (= 1 - F).
P         Phenotypic value.
p         Gene frequency (of A1). = u (M).
p         (Chap. 11, part): proportion selected as parents from a normally
          distributed population. = v (L).
Q         Frequency of homozygous genotype (A2A2).
q         Gene frequency (of A2). = v (M).
R         Response to selection, specifically to individual selection.
          = ΔG (L).
r         (Chap. 8): Repeatability; i.e. correlation between repeated
          measurements of the same individual.
r         (Chap. 13): Coefficient of relationship; i.e. correlation of breeding
          values between related individuals. = rG (L).
r         (Chap. 19): Correlations between two characters:
            rA  additive genetic correlation. = rG (L).
            rE  environmental correlation.
            rP  phenotypic correlation. = r (L).
S         Selection differential in actual units of measurement. = i (L).
s         Coefficient of selection against a particular genotype.
s         (Chap. 13): subscript referring to sib-selection.
Σ         Summation of the quantity following the sign.
σ         Standard deviation (σ² = variance) of the quantity indicated by
          subscript. Components of variance, from an analysis of variance,
          are indicated by subscripts as follows:
            σ²_B  between groups, or families.
            σ²_D  between dams, within sires.
            σ²_S  between sires.
            σ²_T  total; i.e. the sum of all components.
            σ²_W  within groups, or families.
t         Time in number of generations. As a subscript it means "at
          generation t".
t         Phenotypic correlation between members of families.
u         Mutation rate (from A1 to A2).
V         Variance (causal component) of the value or deviation indicated by
          subscript. The most important are:
            VP  Phenotypic variance. = σ²_P (L), = V (M).
            VG  Genotypic variance. = σ²_Ge (L).
            VA  Additive genetic variance. = σ²_G (L), = ½D (M).
            VD  Dominance variance. = ¼H (M).
            VI  Interaction variance. = I (M).
                (VD and VI together correspond to a single (L) symbol, a σ²
                with a subscript involving G.)
            VE  Environmental variance. = σ²_E (L), = E (M).
v         Mutation rate (from A2 to A1).
w         (Chap. 13): subscript referring to selection within families.
X         (Chap. 19): One of two correlated characters.
Y         (Chap. 19): The other of two correlated characters.
y         (Chap. 14): Difference of gene frequency between two lines.
z         (Chap. 11): Height of the ordinate of a normal distribution, in
          units of the standard deviation.

212-5;
difference between definitions, 158.

Canalisation,  272,  308,  341. 

cats,  18-19. 

cattle,  dropsy  in,  13. 

causal  components  of  variance,  150. 

Cepaea  nemoralis,  43,  78-9,  83-4. 
coefficient:

of  inbreeding,  see  under  Inbreed- 
ing; 

of  relationship,  233; 

of  selection,  28. 
combining  ability,  281-6. 
continuous variation, 104-11.
correlated  characters,  312-29,  335-6. 
correlation      (between      characters), 
312-29; 


genetic, 313-8.
correlation (between relatives):

of breeding values, 233;

phenotypic, 151, 162-3.
covariance, 151-2;

environmental,  159-61; 

genetic,  152-9, 

offspring-parent,  152-6, 
sibs, 154, 156-7;

phenotypic, 161-4.
crossbreeding: 

heterosis,  255-63; 

in  plant  and  animal  improvement, 
276-86; 

variance  between  crosses,  279-83. 

Developmental  variation,  141,  143. 
deviation: 

dominance,  122-5; 

environmental,  112; 

interaction  (epistatic),  125-8. 
discontinuous    variation,    104,    108, 

301. 
dispersive    process,    23,    47-8    (see 

also  Inbreeding), 
dominance,  27,  113; 

deviation,  122-5; 

directional,  213; 

effect  on  variance,  137; 

and  fitness,  337; 

and  heterosis,  257; 

and  inbreeding  depression,  251; 

and  scale,  298. 
drift,  random,  50-7; 

in  natural  populations,  81-4. 
Drosophila melanogaster:

Bar,  80-1,  107; 

bristle  number: 

components    of   variance,    140, 

148, 
fitness  relationship,  333,  343, 
frequency distribution, 107,
heritability,  169-70,  177, 
mutation,  342, 
number  of  "loci",  219, 
random  drift,  265, 
repeatability,  145, 
response  to  selection,  190,  195, 
209,  210,  216,  221,  223,  245- 
246; 
brown,  52,  53,  56,  59; 
effective  population  size,  73-4; 
egg  number,  140,  282,  336; 
ovary  size,  140,  145; 
raspberry,  34; 
thorax  length: 

components    of   variance,    130, 

140, 
response  to  selection,  209,  211- 
212,  216,  219,  221,  319; 
wing length, 171-2, 319.
Drosophila  pseudoobscura,  262. 
Drosophila  subobscura,  252. 
Drosophila  tropicalis,  39. 
dwarfism      (chondrodystrophy)      in 
man,  38. 

Effective  factors,  number  of,  217. 
effective  population  size  (number), 

68-74;
ratio of, to actual number, 73-4.
environment, 112 (see also under
Variance);
common, 159-61.
epistasis, 126 (see also under Interaction),
equilibrium: 

Hardy- Weinberg,  9-12; 
under  inbreeding,  74-81, 

and  selection  for  heterozygotes, 
100-3; 
with  linked  loci,  20-1; 
with  more  than  one  locus,  19-20; 
under  mutation,  25-6, 

with  selection,  36-41; 
under  natural  selection,  331-2; 
under  selection  for  heterozygotes, 
41-6. 


eugenics,  36,  40. 
euheterosis,  262. 

Factors,  effective  number  of,  217. 
family  size: 

and  heritability  estimates,  177-83; 

and  inbreeding,  70-3; 

and  selection,  233-46. 
fitness,  natural,  26,  167,  329,  330-43. 
fixation,  54-7,  66-7,  97; 

of  deleterious  genes,  80-1. 

Gene  frequency,  6-22; 
change  of,  23-36, 

by  selection  for  metric  character, 
203-7; 
directional,  213; 
distributions  of: 

with  inbreeding,  52,  55,  84, 
with  inbreeding  and  mutation, 

76, 
with  inbreeding  and  selection, 
79,  81; 
effect  on  variance,  137; 
sampling  variance  of,  50-4,  64. 
generation  interval,  196-8. 
genetic  death,  39-40. 
genotype,  112; 
frequencies,  5-7, 

with  inbreeding,  57-9,  65-6; 
with  random  mating,  9-22. 
genotype-environment correlation, 132-3.
genotype-environment interaction, 133-4, 148-9, 322-4.
genotypic value, 112-4, 123-5, 132.

Hardy- Weinberg  law,  9-15. 
heritability,  135,  163,  165-7; 

estimation,  168-85; 

examples,  167-8; 

of  family  means,  232-7; 

after  inbreeding,  268; 

precision  of  estimates,  177-83; 

realised,  202-3,  296-7; 

of  threshold  characters,  303; 

of  within-family  deviations,  232-7. 




heterosis,  254; 

examples,  260; 

in  single  crosses,  255-61; 

in  wide  crosses,  261-3; 

utilisation  of,  276-86. 
heterozygotes:

frequency  of, 

with  inbreeding,  65-6, 

with  random  mating,  9-13,  38; 

selection  for,  see  under  selection, 
homeostasis: 

developmental,  270-2; 

genetic,  331. 
hybrid  vigour,  see  heterosis, 
hybrids,  uniformity  of,  270-2,  275, 
296. 

Idealised  population,  48-50. 
inbreds: 

experimental  use,  272,  275; 
sub-line  differentiation,  273-4; 
variability,  270-1,  296. 
inbreeding,  60-7; 
coefficient,  61; 
computation, 

from  pedigrees,  86-8, 
from  population  size,  61-4, 
for  regular  systems,  90-5; 
depression,  247-54, 

examples,  249; 
rate  of,  63,  69-70,  92,  96,  101-2; 
regular  systems,  90-5; 
and  variance,  265-72. 
index  for  selection,  325-8. 
incidence  of  threshold  character,  302. 
intangible variation, 141.
integration, 263.
intensity  of  selection,  192. 
interaction: 

between    loci    (epistatic),     125-8, 
263, 
and  heterosis,  259,  262, 
and  inbreeding,  252, 
and  scale,  298-9; 
deviation,  125-8; 

between    genotype    and    environ- 
ment, 133-4,  148-9,  322-4. 


island  model,  77. 
isoalleles,  344. 
isolation  by  distance, 
Line (subdivision of a population), 49.
linkage: 

and  correlation,  312,  320; 

and Hardy-Weinberg equilibrium, 20-21;
and  inbreeding,  97-100; 
and  polygenic  balance,  340-1; 
and    resemblance    between    rela- 
tives, 158-9. 
logarithmic  transformation,  297. 
luxuriance,  262. 
Lycopersicon, 260, 300.

Maize,  277,  290. 
Man: 

albinism,  13,  36; 
birth  weight,  141-2; 
blood  groups,  5,  7,  12,  16,  44; 
dwarfism  (chondrodystrophy),  38; 
sickle-cell  anaemia,  44-6. 
maternal   effects,    140-2,    160,    214, 

252-3,  260-1. 
mating,  types  of,  15. 
metric  character,  104-11. 
migration,  23; 

in  small  populations,  75-9. 
mouse: 

blood-pH,  321; 
body  weight: 

fitness  relationship,  335-6, 
number  of  "loci",  219, 
realised  heritability,  203, 
response  to  selection,  199,  214, 

216,  220, 
selection  differentials,  201, 
sib-analysis,  175-6, 
variance  and  scale,  295; 
growth  rate,  107,  133; 
litter  size: 

frequency  distribution,  107, 

heterosis,  255, 

inbreeding  depression,  252-3, 
repeatability, 144;
non-agouti,  51; 

pigment  granules,  116-7,  126-8; 
pygmy,  113-5,  120-3,  136,  222-3, 

289,  299; 
sex  ratio,  321; 
skeletal  variants,  274; 
vertebrae, number of, 273-4, 305-308.
multiple  alleles,  15-17,  42,  138. 
multiple  measurements,  142-9. 
mutation,  23; 

balanced  against  selection,  36-41; 
change  of  gene  frequency  by,  24-6; 
and  inbreeding,  75-9,  100,  274-5; 
and  origin  of  variation,  342-3; 
rate: 

estimation  of,  38, 
increase  of,  26,  39. 

Neighbourhood  model,  77-9. 
Nicotiana,  260. 
non-additive: 

combination  of  genes,  125-8; 

variance,  139-40,  280,  287,  337. 

Observational  components  of  vari- 
ance, 150. 
overdominance,  27,  287-91; 
effect  on  variance,  137; 
equilibrium  gene  frequency,  41-6; 
and  fitness,  339,  344; 
in  selection  experiments,  213,  222- 
223. 

Panmictic  index,  64,  66. 

panmixia,  8. 

pedigrees  and  inbreeding,  85-90. 

pigs: 

body-length,  174-5; 

litter  size,  253. 
pleiotropy, 289-91, 312-3, 328-9, 333.
polycross,  282. 
polygenes,  106. 
polygenic  balance,  340-1. 
polygenic  variation,  106. 


population: 

base-,  49,  61,  95-6; 

effective  size  (number),  68-74, 
ratio  of,  to  actual  size,  73-4; 

-mean,  113-7; 

size,  50. 
premisses,  2,  3. 
probit  transformation,  302. 
progeny  testing,  229-30. 
proportionate  effect,  207,  219. 

Quantitative  character,  104. 
Quasi-continuous  variation,  301. 

Radiation,  26,  39. 
random  drift,  51-7; 

in  natural  populations,  81-4. 
random  mating,  8-21. 
range,  total,  115,  116,  215-9. 
regression,     offspring     on    parents, 

151,  162-3. 
relatives,       resemblance       between, 

150-64. 
repeatability,  143-9. 

Scale,  108-9,  292-300; 
-effects,  293; 
underlying,  301. 
segregation  index,  217. 
selection,  23,  26,  186-7; 

balanced  by  mutation,  36-41; 
change  of  gene  frequency,  28-36; 
coefficient  of,  28, 

related  to  intensity  of,  203-7; 
combined,  227,  236-7,  239-40; 
for  combining  ability,  283-6; 
correlated  response  to,  318-24; 
in  different  environments,  322-4; 
-differential,  187,  191-8, 

weighting  of,  200-2; 
for  economic  value,  324-9; 
eugenic  effects,  36,  40-1; 
family,   227-8   (see  also  Selection, 
methods), 
and  family  size,  243-5; 
for  heterozygotes,  41-6,  213,  222- 
223,  339, 
affecting inbreeding, 100-3,

253-4; 
-index,  325-8; 
indirect,  320-4; 
individual,  227  (see  also  Selection, 

methods); 
intensity  of,  192, 

related to coefficient of, 203-7;

for  intermediates,  338-42; 

-limit,  215,  219-24,  328-9; 

long-term  results,  215-4; 

mass,  227; 

methods  (use  of  relatives),  225-31, 
heritabilities,  232-6, 
relative  merits,  237-44, 
responses  expected,  231-7; 

natural,  187,  200-2,  212,  253,  266, 

329-43; 
reciprocal,  284; 
recurrent,  283,  286; 
response,  187-91, 

asymmetry,  212-5,  296-7, 
duration,  215-7, 
measurement,  198-203, 
number  of  "loci",  217-9, 
prediction,  189-91,  214-5, 
repeatability,  208-12, 
total,  215-7; 
sib,      229      (see     also     Selection, 

methods); 
in  small  populations,  79-81; 
for  threshold  characters,  308-11; 
variable,  339-40; 

within-family,  227  (see  also  Selec- 
tion, methods), 
selective  value,  26. 
self-fertilising  plants,  247,  276-7. 
sex-linked  genes,  17-19,  34. 
sickle-cell  anaemia,  44-6. 
sib-analysis,  172-6. 
snails,  43,  78-9,  83-4. 


systematic  processes,  23, 

in  small  populations,  74-81. 

Threshold  characters,  301-11,  321. 
transformation  of  scale,  108-9,  292- 
300; 

logarithmic,  297; 

probit,  302. 
tobacco,  260. 
tomato,  260,  300. 
top-cross,  282. 
twins,  131,  183-5. 

Uniformity  of  inbred  lines,  54,  66-7, 
97,  100-3. 

Value: 

genotypic, 112-4, 123-5;
phenotypic,  112. 
variance: 

additive,  135-8; 
between  crosses,  279-83; 
components,  129-30, 

causal,  150, 

genetic, 134-40,

observational,  150; 
dominance,  135-8,  163, 
environmental,  130-4,  140-9, 

common,  159-61, 

general,  143-9, 

inbreeding  effects,  270-2, 

special,  143-9; 
genotypic,  130-4; 
inbreeding  effects,  265-72; 
interaction  (epistatic),  138-40, 

and  resemblance  between  rela- 
tives, 157-9; 
non-additive, 139-40, 280, 287, 337.
variation: 

continuous, 104-11;
discontinuous,  104,  108,  301; 
quasi-continuous,  301. 